(54) Method and apparatus for special video reproduction modes



(57) Special reproduction control information comprises a plurality of items (100) of frame information. Each of the items of frame information comprises video location information (101) indicating the location of video data to be reproduced in a special reproduction and display time control information (102) indicating the time for displaying the video data.

Description

[0001] The present invention relates to a special reproduction control information describing method for describing special reproduction control information used to perform special reproduction for target video contents, a special reproduction control information creating method for creating the special reproduction control information, a special reproduction control information creating apparatus, and a video reproduction apparatus and method for performing special reproduction by using the special reproduction control information.

[0002] In recent years, a motion picture is compressed as a digital video and is stored in disk media represented by a DVD and an HDD so that a video can be reproduced at random. A video can be reproduced halfway from a desired timing with virtually no waiting time. As with conventional tape media, disk media can be fast reproduced at two to four times speed or can be reversely reproduced.

[0003] However, there is a problem in that a video is very long in many cases, and even fast reproduction at two to four times speed does not compress the time sufficiently to view the whole contents of the video. When the rate of the fast reproduction is increased, the scene changes become too rapid for the viewer to follow, so that grasping the contents is difficult, and portions which are not needed are also reproduced, which wastes time.
[0004] Accordingly, the present invention is directed to a method and apparatus that substantially obviate one or more of the problems due to limitations and disadvantages of the related art.

[0005] According to one aspect of the present invention, a method of describing frame information comprises:

describing, for a frame extracted from a plurality of frames in a source video data, first information specifying a location of the extracted frame in the source video data; and
describing, for the extracted frame, second information relating to a display time of the extracted frame.

[0006] According to another aspect of the present invention, an article of manufacture comprising a computer usable medium storing frame information, the frame information comprises:

first information, described for a frame extracted from a plurality of frames, specifying a location of the extracted frame in the source video data; and
second information, described for the extracted frame, relating to a display time of the extracted frame.

[0007] According to another aspect of the present invention, an apparatus for creating frame information comprises:

a unit configured to extract a frame from a plurality of frames in a source video data;
a unit configured to create the frame information including first information specifying a location of the extracted frame and second information relating to a display time of the extracted frame; and
a unit configured to link the extracted frame to the frame information.

[0008] According to another aspect of the present invention, a method of creating frame information comprises:

extracting a frame from a plurality of frames in a source video data; and
creating the frame information including first information specifying a location of the extracted frame in the source video data and second information relating to a display time of the extracted frame.

[0009] According to another aspect of the present invention, an apparatus for performing a special reproduction comprises:

a unit configured to refer to frame information described for a frame extracted from a plurality of frames in a source video data and including first information specifying a location of the extracted frame in the source video data and second information relating to a display time of the extracted frame;
a unit configured to obtain the video data corresponding to the extracted frame based on the first information;
a unit configured to determine the display time of the extracted frame based on the second information; and
a unit configured to display the obtained video data for the determined display time.

[0010] According to another aspect of the present invention, a method of performing a special reproduction comprises:

referring to frame information described for a frame extracted from a plurality of frames in a source video data and including first information specifying a location of the extracted frame and second information relating to a display time of the extracted frame;
obtaining the video data corresponding to the extracted frame based on the first information;
determining the display time of the extracted frame based on the second information; and
displaying the obtained video data for the determined display time.

[0011] According to another aspect of the present invention, an article of manufacture comprising a computer usable medium having computer readable program code means embodied therein, the computer readable program code means performing a special reproduction, the computer readable program code means comprises:

computer readable program code means for causing a computer to refer to frame information described for a frame extracted from a plurality of frames in a source video data and including first information specifying a location of the extracted frame and second information relating to a display time of the extracted frame;
computer readable program code means for causing a computer to obtain the video data corresponding to the extracted frame based on the first information;
computer readable program code means for causing a computer to determine the display time of the extracted frame based on the second information; and
computer readable program code means for causing a computer to display the obtained video data for the determined display time.

[0012] According to another aspect of the present invention, a method of describing sound information comprises:

describing, for a frame extracted from a plurality of sound frames in a source sound data, first information specifying a location of the extracted frame in the source sound data; and
describing, for the extracted frame, second information relating to a reproduction start time and reproduction time of the sound data of the extracted frame.

[0013] According to another aspect of the present invention, an article of manufacture comprising a computer usable medium storing frame information, the frame information comprises:

first information, described for a frame extracted from a plurality of sound frames, specifying a location of the extracted frame in the source sound data; and
second information, described for the extracted frame, relating to a reproduction start time and reproduction time of the sound data of the extracted frame.


[0014] According to another aspect of the present invention, a method of describing text information comprises:

describing, for a frame extracted from a plurality of text frames in a source text data, first information specifying a location of the extracted frame in the source text data; and
describing, for the extracted frame, second information relating to a display start time and display time of the text data of the extracted frame.

[0015] According to another aspect of the present invention, an article of manufacture comprising a computer usable medium storing frame information, the frame information comprises:

first information, described for a frame extracted from a plurality of text frames in a source text data, specifying a location of the extracted frame in the source text data; and
second information, described for the extracted frame, relating to a display start time and display time of the text data of the extracted frame.

[0016] This summary of the invention does not necessarily describe all necessary features so that the invention may also be a sub-combination of these described features.

[0017] The present invention can be implemented either in hardware or in software on a general purpose computer. Further, the present invention can be implemented in a combination of hardware and software. The present invention can also be implemented by a single processing apparatus or a distributed network of processing apparatuses.

[0018] Since the present invention can be implemented by software, the present invention encompasses computer code provided to a general purpose computer on any suitable carrier medium. The carrier medium can comprise any storage medium such as a floppy disk, a CD ROM, a magnetic device or a programmable memory device, or any transient medium such as any signal, e.g. an electrical, optical or microwave signal.

[0019] The invention can be more fully understood from the following detailed description when taken in conjunction with the accompanying drawings, in which:

FIG. 1 is a view showing an example of a data structure of special reproduction control information according to one embodiment of the present invention;
FIG. 2 is a view showing an example of a structure of a special reproduction control information creating apparatus;
FIG. 3 is a view showing another example of a structure of the special reproduction control information creating apparatus;
FIG. 4 is a flowchart showing one example of a procedure for the apparatus shown in FIG. 2;
FIG. 5 is a flowchart showing one example of a procedure for the apparatus shown in FIG. 3;
FIG. 6 is a view showing an example of a structure of a video reproduction apparatus;
FIG. 7 is a flowchart showing one example of a procedure for the apparatus shown in FIG. 6;
FIG. 8 is a view showing an example of a data structure of special reproduction control information;
FIG. 9 is a view explaining video location information for referring to an original video frame;
FIG. 10 is a view explaining video location information for referring to an image data file;
FIG. 11 is a view explaining a method for extracting video data in accordance with a motion of a screen;
FIG. 12 is a view explaining video location information for referring to the original video frame;
FIG. 13 is a view for explaining video location information for referring to the image data file;
FIG. 14 is a view showing an example of a data structure of special reproduction control information in which plural original video frames are referred to;
FIG. 15 is a view explaining a relation between the video location information and the original plural video frames;
FIG. 16 is a view explaining a relation between the image data file and the original plural video frames;
FIG. 17 is a view explaining video location information for referring to the original video frame;
FIG. 18 is a view for explaining video location information for referring to the image data file;
FIG. 19 is a flowchart for explaining a special reproduction;

FIG. 20 is a view for explaining a method for extracting video data in accordance with a motion of a screen;
FIG. 21 is a view for explaining a method for extracting video data in accordance with a motion of a screen;
FIG. 22 is a flowchart showing one example for calculating display time at which a scene change quantity becomes constant as much as possible;
FIG. 23 is a flowchart showing one example for calculating a scene change quantity of the whole frame from an MPEG video;
FIG. 24 is a view for explaining a method for calculating a scene change quantity of a video from an MPEG stream;
FIG. 25 is a view for explaining a processing procedure for calculating display time at which a scene change quantity becomes constant as much as possible;
FIG. 26 is a flowchart showing one example of the processing procedure for conducting special reproduction on the basis of special reproduction control information;
FIG. 27 is a flowchart showing one example for conducting special reproduction on the basis of a display cycle;

FIG. 28 is a view for explaining a relationship between a calculated display time and the display cycle;
FIG. 29 is a view for explaining a relationship between a calculated display time and the display cycle;
FIG. 30 is a view showing another example of a data structure of special reproduction control information;
FIG. 31 is a view explaining a method for extracting video data in accordance with a motion of a screen;
FIG. 32 is a view explaining video location information for referring to the original video frame;
FIG. 33 is a view showing another example of a data structure of special reproduction control information;
FIG. 34 is a view showing another example of a data structure of special reproduction control information;
FIG. 35 is a view showing another example of a data structure of special reproduction control information;
FIG. 36 is a flowchart showing one example for calculating display time from the importance;
FIG. 37 is a view for explaining a method for calculating display time from the importance;
FIG. 38 is a flowchart showing one example for calculating importance data on the basis of the idea that a scene having a large sound level is important;
FIG. 39 is a flowchart showing one example for calculating importance data on the basis of the idea that a scene in which many important words appear with sound recognition is important, or a processing procedure for calculating importance data on the basis of the idea that the scene in which the number of words spoken per unit time is large is important;
FIG. 40 is a flowchart showing one example for calculating importance data on the basis of the idea that a scene in which many important words appear with telop recognition is important, or a processing procedure for calculating importance data on the basis of the idea that the scene in which the number of words included in the telop which appears per unit time is large with telop recognition is important;
FIG. 41 is a flowchart showing one example for calculating importance data on the basis of the idea that the scene in which a large character appears as a telop is important;
FIG. 42 is a flowchart showing one example for calculating importance data on the basis of the idea that the scene in which many human faces appear is important, or a processing for calculating importance data on the basis of the idea that the scene where human faces are displayed in an enlarged manner is important;
FIG. 43 is a flowchart showing one example for calculating importance data on the basis of the idea that the scene in which videos similar to the registered important scene appear is important;
FIG. 44 is a view showing another example of a data structure of special reproduction control information;

FIG. 45 is a view showing another example of a data structure of special reproduction control information;
FIG. 46 is a view showing another example of a data structure of special reproduction control information;
FIG. 47 is a view for explaining a relationship between information as to whether the scene is to be reproduced or not and the reproduced video;
FIG. 48 is a flowchart showing one example of a processing procedure of special reproduction including reproduction and non-reproduction judgment;
FIG. 49 is a view showing one example of a data structure when sound information or text information is added;
FIG. 50 is a view showing one example of a data structure for describing only sound information separately from frame information;
FIG. 51 is a view showing one example of a data structure for describing only text information separately from frame information;
FIG. 52 is a view for explaining a synchronization of a reproduction of each of media;
FIG. 53 is a flowchart showing one example of a determination procedure of a sound reproduction start time and a sound reproduction time in a video frame section;
FIG. 54 is a flowchart showing one example for preparing reproduction sound data and correcting video frame display time;
FIG. 55 is a flowchart showing one example of a processing procedure of obtaining text information with telop recognition;
FIG. 56 is a flowchart showing one example of a processing procedure of obtaining text information with sound recognition;
FIG. 57 is a flowchart showing one example of a processing procedure of preparing text information;
FIGS. 58A and 58B are views for explaining a method of displaying text information;
FIG. 59 is a view showing one example of a data structure of special reproduction control information for sound information;
FIG. 60 is a view showing another example of a data structure of special reproduction control information for sound information;
FIG. 61 is a view explaining a summary reproduction of the sound/music data; and
FIG. 62 is a view explaining another summary reproduction of the sound/music data.

[0020] Preferred embodiments of the present invention will now be described with reference to the accompanying drawings.

[0021] The embodiments relate to a reproduction of video contents having video data using special reproduction control information. The video data comprises a set of video frames (video frame group) constituting a motion picture.

[0022] The special reproduction control information is created from the video data by a special reproduction control information creating apparatus and attached to the video data. The special reproduction is reproduction by a method other than a normal reproduction. The special reproduction includes a double speed reproduction (or a high speed reproduction), jump reproduction (or jump continuous reproduction), and a trick reproduction. The trick reproduction includes a substituted reproduction, an overlapped reproduction, a slow reproduction and the like. The special reproduction control information is referred to when the special reproduction is executed in the video reproduction apparatus.
[0023] FIG. 1 shows one example of a basic data structure of the special reproduction control information.

[0024] In this data structure, plural items of frame information "i" (i = 1 to N) are described in correspondence to the frame appearance order in the video data. Each item of frame information 100 includes a set of video location information 101 and display time control information 102. The video location information 101 indicates a location of video data to be displayed at the time of special reproduction. The video data to be displayed may be one frame, a group of a plurality of continuous frames, or a group formed of a part of a plurality of continuous frames. The display time control information 102 forms the basis of calculating the display time of the video data.
[0025] In FIG. 1, the frame information "i" is arranged in the order of appearance of the frames in the video data. When information indicating an order of the frame information is described in the frame information "i", the frame information "i" may be arranged and described in any order.

[0026] The reproduction rate information 103 attached to a plurality of items of frame information "i" shows the reproduction speed rate and is used for designating the reproduction at a speed several times higher than that corresponding to the display time as described by the display time control information 102. However, the reproduction rate information 103 is not essential information. The information 103 may constantly be attached, may not constantly be attached, or may be selectively attached. Even when the reproduction rate information 103 is attached, the information need not be used at the time of special reproduction. The reproduction rate information may be constantly used, may not be constantly used, or may be selectively used.

[0027] In FIG. 1, it is possible to further add other control information to the frame information group together with the reproduction rate information or in place of the reproduction rate information. In FIG. 1, it is also possible to add different control information to each frame information "i". In these cases, all of the information included in the special reproduction control information may be used on the side of the video reproduction apparatus, or only a part of the information may be used.
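The optional role of the reproduction rate information 103 can be illustrated with a short Python sketch. It assumes, as one plausible reading, that a rate of r shortens each described display time by a factor of r; the function name `effective_display_times` and the `use_rate` flag are hypothetical and are not taken from the specification.

```python
from typing import List, Optional


def effective_display_times(
    display_times: List[float],
    reproduction_rate: Optional[float] = None,
    use_rate: bool = True,
) -> List[float]:
    """Return the display times actually used at special reproduction.

    reproduction_rate models information 103; None means it is not attached.
    use_rate models the reproduction side choosing to ignore an attached rate.
    """
    if reproduction_rate is None or not use_rate:
        # Rate not attached, or attached but not used: times stay as described.
        return list(display_times)
    if reproduction_rate <= 0:
        raise ValueError("reproduction_rate must be positive")
    # Assumed interpretation: a rate of r plays r times faster.
    return [t / reproduction_rate for t in display_times]


print(effective_display_times([2.0, 4.0], reproduction_rate=2.0))
print(effective_display_times([2.0, 4.0], reproduction_rate=2.0, use_rate=False))
```

Keeping the rate separate from the per-frame display times matches the description that the rate may be attached or used selectively without rewriting each item of frame information.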
[0028] FIG. 2 shows an example of a structure of an apparatus for creating special reproduction control information.

[0029] This special reproduction control information creating apparatus comprises a video data storage unit 2, a video data processing unit 1 including a video location information processing unit 11 and a display time control information processing unit 12, and a special reproduction control information storage unit 3. As will be described later in detail, since the video data (encoded data) is decoded before being displayed, a processing time for decoding the video data is required from when the display instruction is issued until the video is displayed. In order to reduce this processing time, it is proposed to decode the video data beforehand and store the result as an image data file.

[0030] If an image data file is used (the image data file may be constantly used or selectively used), an image data file creating unit 13 (in the video data processing unit 1) and an image data file storage unit 4 are further provided as shown in FIG. 3. If other control information which is determined on the basis of the video data is added to the special reproduction control information, the corresponding function is appropriately added to the inside of the video data processing unit 1.

[0031] If an operation by a user intervenes in this processing, a GUI is used for displaying, for example, video data in frame units and for receiving an input of an instruction by the user, though the GUI is omitted in FIGS. 2 and 3.
[0032] In FIGS. 2 and 3, a CPU, a memory, an external storage device and a network communication device, which are provided when needed, and software such as driver software and an OS, which is used when needed, are not shown.

[0033] The video data storage unit 2 stores video data which becomes a target of processing for creating special reproduction control information (or special reproduction control information and image data files).

[0034] The special reproduction control information storage unit 3 stores special reproduction control information that has been created.

[0035] The image data file storage unit 4 stores image data files that have been created.

[0036] The storage units 2, 3, and 4 comprise, for example, a hard disk, an optical disk and a semiconductor memory. The storage units may comprise separate storage devices. All or part of the storage units may comprise the same storage device.
[0037] The video data processing unit 1 creates the special reproduction control information (or the special reproduction control information and image data file) on the basis of the video data which becomes a target of processing.

[0038] The video location information processing unit 11 determines (extracts) a video frame (group) which should be displayed or which can be displayed at the time of special reproduction, and prepares the video location information 101 which should be described in each frame information "i".

[0039] The display time control information processing unit 12 prepares the display time control information 102 associated with the display time of the video frame (group) associated with each frame information "i".

[0040] The image data file creating unit 13 conducts a processing for preparing an image data file from the video data.

[0041] The special reproduction control information creating apparatus can be realized, for example, in the form of software running on a computer. The apparatus may also be realized as a dedicated apparatus for creating the special reproduction control information.

[0042] FIG. 4 shows an example of a processing procedure in the case of the structure of FIG. 2. The video data is read (step S11), video location information 101 is created (step S12), display time control information 102 is created (step S13), and special reproduction control information is stored (step S14). The procedure of FIG. 4 may be consecutively conducted for each frame information, or each processing may be conducted in batches. Other procedures can also be conducted.
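The creation procedure of steps S11 to S14 can be sketched in Python as follows. This is only an illustration under stated assumptions: frames are picked at a fixed interval, every item receives the same display time, and the result is serialized as JSON text. The function names, the `interval` and `display_time` parameters, and the JSON layout are hypothetical; the specification does not prescribe a storage format.

```python
import json


def create_control_information(video_frames, interval=10, display_time=0.5):
    """Build special reproduction control information for already-read video data."""
    items = []
    for index in range(0, len(video_frames), interval):
        # Step S12: video location information 101 (here simply a frame index).
        # Step S13: display time control information 102 (here a fixed time
        # in seconds, which is an assumption of this sketch).
        items.append({"video_location": index, "display_time": display_time})
    return items


def serialize_control_information(items):
    """Step S14: produce a storable representation (JSON text in this sketch)."""
    return json.dumps({"frame_information": items})


# Step S11: a list of placeholder strings stands in for the video data read.
video_frames = [f"frame-{n}" for n in range(25)]
items = create_control_information(video_frames)
stored = serialize_control_information(items)
print(stored)
```

Here steps S12 and S13 are conducted consecutively for each item of frame information; running each step in a batch over all frames would be an equally valid arrangement.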
[0043] FIG. 5 shows an example of a processing procedure in the case of the structure of FIG. 3. A procedure for preparing and storing image data files is added to the procedure of FIG. 4 (step S22). The image data file is created and/or stored together with the preparation of the video location information 101. It is also possible to create the video location information 101 at a timing different from that of FIG. 4. In the same manner as in the case of FIG. 4, the procedure of FIG. 5 may be conducted for each frame information, or may be conducted in batches. Other procedures can also be conducted.

[0044] FIG. 6 shows an example of a video reproduction apparatus.

[0045] This video reproduction apparatus comprises a controller 21, a normal reproduction processing unit 22, a special reproduction processing unit 23, a display device 24, and a contents storage unit 25. If contents are handled wherein audio such as sound or the like is added to the video data, it is preferable to provide a sound output section. If contents are handled wherein text data is added to the video data, the text may be displayed on the display device 24, or may be output from the sound output section. If contents are handled wherein a program is attached, an attached program execution section may be provided.

[0046] The contents storage unit 25 stores at least video data and special reproduction control information. As will be described later in detail, in the case where the image data file is used, the image data file is further stored. The sound data, the text data, and the attached program are further stored in some cases.

[0047] The contents storage unit 25 may be arranged at one location in a concentrated manner, or may be arranged in a distributed manner. The point is that the contents can be accessed by the normal reproduction processing unit 22 and the special reproduction processing unit 23. The video data, special reproduction control information, image data files, sound data, text data, and attached program may be stored in separate media or may be stored in the same medium. As the medium, for example, a DVD is used. These may be data which are transmitted via a network.

[0048] The controller 21 basically receives an instruction such as a normal reproduction or a special reproduction with respect to the contents from the user via a user interface such as a GUI or the like. The controller 21 instructs the corresponding processing unit to reproduce the designated contents by the designated method.

[0049] The normal reproduction processing unit 22 is used for the normal reproduction of the designated contents.

[0050] The special reproduction processing unit 23 is used for the special reproduction (for example, a high speed reproduction, jump reproduction, trick reproduction, or the like) of the designated contents by referring to the special reproduction control information.

[0051] The display device 24 is used for displaying a video.

[0052] The video reproduction apparatus can be realized by computer software. It may partially be realized by hardware (for example, a decode board (MPEG-2 decoder) or the like). The video reproduction apparatus may be realized as a dedicated device for video reproduction.






[0053] FIG. 7 shows one example of a reproduction processing procedure of the video reproduction apparatus of FIG. 6. At step S31, it is determined whether the user requests a normal reproduction or a special reproduction. When a normal reproduction is requested, the designated video data is read at step S32 and a normal reproduction is conducted at step S33. When a special reproduction is requested by the user, the special reproduction control information corresponding to the designated video data is read at step S34, and the location of the video data to be displayed is specified and the display time is determined at step S35. The corresponding frame (group) is read from the video data (or the image data file) at step S36 to conduct special reproduction of the designated contents at step S37. The location of the video data can be specified and/or the display time can be determined at a timing different from that in FIG. 7. The procedure of the special reproduction may be consecutively conducted for each frame information, or each processing may be conducted in batches. Other procedures can be conducted. For example, in the case of the reproduction method in which the display time of each frame is equally set to a constant value, it is not necessary to determine the display time.
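The branch between normal and special reproduction in FIG. 7 can be sketched as a Python function that returns a display schedule instead of driving a real display. The function name `reproduce`, the dictionary keys, and the 1/30 second frame period are assumptions introduced for this illustration only.

```python
def reproduce(mode, video_frames, control_info=None, frame_period=1 / 30):
    """Return a list of (frame, display_time) pairs for the requested mode."""
    if mode == "normal":
        # Steps S32 and S33: every frame is shown for one frame period.
        return [(frame, frame_period) for frame in video_frames]
    if mode == "special":
        if control_info is None:
            raise ValueError("special reproduction needs control information")
        schedule = []
        for item in control_info:  # Step S34: control information is read.
            # Step S35: the location is specified and the display time determined.
            location = item["video_location"]
            display_time = item["display_time"]
            # Step S36: the corresponding frame is read from the video data.
            schedule.append((video_frames[location], display_time))
        return schedule  # Step S37 would display each entry in turn.
    raise ValueError(f"unknown reproduction mode: {mode}")


video = ["f0", "f1", "f2", "f3"]
control = [
    {"video_location": 0, "display_time": 0.5},
    {"video_location": 3, "display_time": 1.0},
]
print(reproduce("special", video, control))
```

Returning a schedule keeps the sketch testable; an actual apparatus would hand each pair to the display device 24 as it is produced.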
[0054] Both in the normal reproduction and in the special reproduction, the user may make various designations (for example, the start point or the end point of the reproduction in the contents, a reproduction speed in the high speed reproduction, a reproduction time in the high speed reproduction, and other methods of special reproduction or the like).

[0055] Next, an algorithm for creating the frame information of the special reproduction control information and an algorithm for calculating the display time of the special reproduction will be schematically explained.

[0056] At the time of creating the frame information, the frame to be used at the time of the special reproduction is determined from the video data, the video location information is created, and the display time control information is created.
[0057] The frame is determined by such methods as: (1) a method for calculating the video frame on the basis of some characteristic quantity with respect to the video data (for example, a method for extracting the video frames such that the total of the characteristic quantity (for example, the scene change quantity) between the extracted frames becomes constant, and a method for extracting the video frames such that the total of importance between the extracted frames becomes constant), and (2) a method for calculating the video frame on a fixed standard (for example, a method for extracting frames at random, and a method for extracting frames at an equal interval). The scene change quantity is also called a frame activity value.
[0058] In the creation of the display time control
information 121, there are available: (i) a method for
calculating an absolute value or a relative value of the
display time or a display frame number, (ii) a method for
calculating reference information which is a base of the
display time and a display frame number (for example, the
information designated by the user, characters in the
video, sound synchronized with the video, persons in the
video, and the importance obtained on the basis of a
specific pattern in the video), and (iii) a method for
describing both (i) and (ii).

[0059] It is possible to appropriately combine (1) or (2)
with (i), (ii) or (iii). Needless to say, other methods are
possible. One specific combination out of such methods can
be used, or a plurality of combinations of these methods
may be provided and appropriately selected.

[0060] In a specific case, at the same time as the
determination of the frame by the method (1), a relative
value of the display time and the number of display frames
are determined. If this method is constantly used, it is
possible to omit the display time control information
processing unit 102.

[0061] At the time of the special reproduction, it is
assumed that the special reproduction is conducted by
referring to the display time control information 121 of
(i), (ii) or (iii) included in the frame information.
However, the described value may be followed as it is, or
the described value may be corrected and used. In addition
to the described value and the corrected value thereof,
independently created other information and information
input from the user may be used. Alternatively, only the
independently created other information and the information
input from the user may be used. A plurality of these
methods may be enabled and appropriately selected.

[0062] Next, an outline of the special reproduction will
be explained.

[0063] A double speed reproduction (or a high speed
reproduction) carries out reproduction in a time shorter
than the time required for the normal reproduction of the
original contents by reproducing a part of the frames out
of the whole frames constituting the video data contents.
For example, the frames indicated by the frame information
are each displayed for the display time indicated by the
display time control information 121, in the order of time
sequence. Based on a request from the user, such as a speed
designation request designating at how many times the speed
of the normal reproduction the original contents are
reproduced (in what fraction of the time required for the
normal reproduction the original contents are reproduced)
or a time designation request designating how much time is
taken for reproducing the contents, the display time of
each frame (group) is determined to satisfy the
reproduction request. The high speed reproduction is also
called a summarized reproduction.
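As an illustration of how the display time of each frame might be determined to satisfy a speed designation request or a time designation request, the following minimal Python sketch scales relative display time values so that they sum to the requested total. The function names, and the assumption that the display time information holds relative weights, are hypothetical and are not taken from the described method.

```python
def display_times_for_total(weights, total_time):
    """Scale relative display-time weights so that they sum to total_time (sec).

    Assumption: each item of frame information carries a relative value.
    """
    weight_sum = sum(weights)
    if weight_sum <= 0:
        raise ValueError("weights must have a positive sum")
    return [total_time * w / weight_sum for w in weights]


def display_times_for_speed(weights, original_duration, speed):
    """Speed designation: reproduce the contents in original_duration / speed."""
    if speed <= 0:
        raise ValueError("speed must be positive")
    return display_times_for_total(weights, original_duration / speed)


# An 80-second original reproduced at ten times speed takes 8 seconds in total.
times = display_times_for_speed([1, 3, 4], original_duration=80.0, speed=10.0)
```

A time designation request maps directly onto display_times_for_total, while a speed designation request is first converted into a total time.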
[0064] A jump reproduction (or a jump continuous
reproduction) is such that a part of the frames shown in
the frame information is subjected to non-reproduction in
the high speed reproduction, for example, on the basis of
the reproduction/non-reproduction information described
later. The high speed reproduction is conducted with
respect to the frames excluding the frames which are
subjected to non-reproduction out of the frames shown in
the frame information.
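The jump reproduction can be pictured as a filter applied before the high speed reproduction. The sketch below is only an assumption about how the reproduction/non-reproduction information might be held: a hypothetical boolean field named "reproduce" on each item of frame information.

```python
def jump_reproduction(frames):
    """Keep only the items whose (hypothetical) 'reproduce' flag is True."""
    return [f for f in frames if f["reproduce"]]


frames = [
    {"location": 0, "display_time": 1.0, "reproduce": True},
    {"location": 120, "display_time": 0.5, "reproduce": False},
    {"location": 240, "display_time": 2.0, "reproduce": True},
]

# The remaining items are then reproduced as in the high speed reproduction.
kept = jump_reproduction(frames)
```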
[0065] A trick reproduction is a reproduction other than
the normal reproduction, excluding the high speed
reproduction and the jump reproduction. For example, at the
time of reproducing the frames shown in the frame
information, various forms can be considered, such as: a
substituted reproduction for reproducing a certain portion
by replacing the order of time sequence; an overlapped
reproduction for reproducing a certain portion repeatedly a
plurality of times; a variable speed reproduction in which
a certain portion is reproduced at a speed lower than the
reproduction of another portion (including the case in
which the portion is reproduced at the speed of normal
reproduction, or the case in which the portion is
reproduced at a speed lower than the normal reproduction)
or at a speed higher than another portion, or the
reproduction of a certain portion is temporarily suspended,
or such forms of reproduction are appropriately combined;
and a random reproduction for reproducing, in a random time
sequence, each of a constant set of frames shown in the
frame information.
[0066] Needless to say, it is possible to appropriately
combine a plurality of kinds of methods. For example, at
the time of the double speed reproduction, the important
portion may be reproduced a plurality of times, and various
variations can be considered, such as a method for setting
the reproduction speed to the normal reproduction speed.

[0067] Hereinafter, embodiments of the present invention
will be specifically explained in detail.
[0068] In the beginning, the embodiments will be explained
by taking as an example a case in which a reproduction
frame is determined on the basis of the scene change
quantity between adjacent frames as the characteristic
quantity of the video data.
[0069] Here, there will be explained a case in which one
frame corresponds to one item of frame information.
[0070] FIG. 8 shows one example of a data structure of the
special reproduction control information created for the
target video data.

[0071] In this data structure, the display time
information 121, which is information showing an absolute
or a relative display time, is described as the display
time control information 102 in FIG. 1 (or instead of the
display time control information 102). A structure
describing the importance in addition to the display time
control information 102 will be described later.
[0072] The video location information 101 is information
which enables the specification of the location of the
video in the original video, and either a frame number (for
example, a sequence number from the first frame) or a
number which specifies one frame in a stream, like a time
stamp, may be used. If the video data corresponding to the
frame extracted from the original video stream is set as a
separate file, a URL or the like may be used as information
for specifying the file location.

[0073] The display time information 121 is information
which specifies the time for displaying the video or the
number of frames. It is possible to describe an actual time
or a number of frames as a unit, or a relative value (for
example, a normalized numeric value) which clarifies the
relationship of the relative time length with the display
time information described in other frame information. In
the latter case, the actual reproduction time of each video
is calculated from the total reproduction time as a whole.
Alternatively, instead of describing the continuation time
of the display for each video, a description with a
combination of a start time measured from a specific timing
(for example, the start time of the first video is set to
0) and an end time, or a description with a combination of
the start time and the continuation time, may be used.
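The frame information described above can be sketched as a small record, together with a conversion from relative display time values to start times and continuation times. This is a minimal illustration under stated assumptions: the class name, the field names and the choice of a frame number as the video location are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class FrameInformation:
    video_location: int   # assumed: frame number in the original stream (101)
    display_time: float   # assumed: relative value of the display time (121)


def to_start_and_duration(items, total_time):
    """Turn relative display times into (start, duration) pairs in seconds.

    The start time of the first video is set to 0, as in the example above.
    """
    total_weight = sum(item.display_time for item in items)
    schedule = []
    start = 0.0
    for item in items:
        duration = total_time * item.display_time / total_weight
        schedule.append((start, duration))
        start += duration
    return schedule


items = [
    FrameInformation(0, 0.25),
    FrameInformation(90, 0.5),
    FrameInformation(300, 0.25),
]
schedule = to_start_and_duration(items, total_time=8.0)
```

The same schedule could equally be stored as start and end times; the pair of start time and continuation time is used here only to keep the sketch short.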

[0074] In the special reproduction, basically, the
reproduction of the video present at the location specified
with the video location information 101, for only the
display time specified with the display time information
121, is consecutively conducted for the number of items of
frame information "i" included in the arrangement, such as
shown in FIG. 8.
[0075] If the start time and the end time or the
continuation time are specified and this designation is
followed, the video present at the location specified with
the video location information 101 is reproduced from the
start time specified with the display time information 121
up to the end time or during the continuation time,
consecutively for the number of items of the frame
information "i" included in the arrangement.
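The consecutive reproduction over the arrangement of frame information can be sketched as a loop that emits, for each display tick, the video location to be shown. The pair layout (location, display time in seconds) and the fixed tick rate are assumptions made only for this illustration.

```python
def displayed_locations(frame_items, ticks_per_second):
    """Return the video location shown at each display tick.

    frame_items is a list of (video_location, display_time_in_seconds) pairs
    standing in for the video location information 101 and the display time
    information 121. Each item is shown for at least one tick.
    """
    shown = []
    for location, display_time in frame_items:
        ticks = max(1, round(display_time * ticks_per_second))
        shown.extend([location] * ticks)
    return shown


ticks = displayed_locations([(120, 0.5), (480, 1.0)], ticks_per_second=4)
```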
[0076] The described display time can be processed and
reproduced by using parameters such as reproduction rate
information and additional information.
[0077] Next, a method for describing the video location
information will be explained by using FIGS. 9 through 11.

[0078] FIG. 9 explains a method for describing the video
location information referring to the original video frame.

[0079] In FIG. 9, a time axis 200 corresponds to the
original video stream based on which the frame information
for the special reproduction is created, and a video 201
corresponds to one frame which becomes a description target
in the video stream. A time axis 202 corresponds to the
reproduction time of a video at the time of the special
reproduction using the video 201 extracted from the
original video stream. A display time 203 is a section on
the time axis 202 corresponding to one video 201. For
example, the video location information 101 showing the
location of the video 201 and the video display time 121
showing the length of the display time 203 are described as
frame information. As described above, the description of
the location of the video 201 may be given in any form,
such as a frame number, a time stamp or the like, as long
as one frame in the original video stream can be specified.
This frame information is described in the same manner with
respect to the other videos 201.
[0080] FIG. 10 explains a method for describing the video
location information referring to the image data file.

[0081] The method for describing the video location
information shown in FIG. 9 directly refers to the frame in
the original video which is to be subjected to the special
reproduction. The method for describing the video location
information shown in FIG. 10 is a method in which an image
data file 300 corresponding to a single frame 302 extracted
from the original video stream is created in a separate
file, and the location thereof is described. As a method
for describing the file location, a file can be handled in
the same manner by using, for example, the URL or the like
both in the case where the file is present on a local
storage device and in the case where the file is present on
the network. A set of the video location information 101
showing the location of this image data file and the video
display time 121 showing the length of the corresponding
display time 301 is described as frame information.

[0082] If a correspondence to the original video frame is
required, information showing the single frame 302 of the
original video corresponding to the described frame
information (similar to the video location information in
the case of, for example, FIG. 9) may be included in the
frame information. The frame information may then comprise
the video location information, the display time
information and the original video information. When the
original video information is not required, it need not be
described.
[0083] The configuration of the video data described with
the method of FIG. 10 is not particularly restricted. For
example, the frame of the original video may be used as it
is or may be reduced. This is effective for conducting a
reproduction processing at a high speed because it is not
required to decode the original video.
[0084] If the original video stream is compressed by means
of MPEG-1, MPEG-2 or the like, a reduced video can be
created at a high speed only by partially decoding the
stream. In this method, only the DCT (discrete cosine
transform) coefficients of an I picture frame (an
intra-frame encoded frame) are decoded, and a reduced video
is created by using the DCT coefficients.
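One common way to build a reduced video from DCT coefficients, shown here only as an assumption about what such partial decoding might look like, is to use the DC coefficient of each 8x8 block. For an orthonormal 8x8 DCT, the DC coefficient equals the sum of the 64 samples divided by 8, so the block mean is the DC coefficient divided by 8. The sketch ignores the quantization and level shift applied in a real MPEG stream, and the function names are hypothetical.

```python
def dc_coefficient(block):
    """DC term of an orthonormal 8x8 DCT: sum of the 64 samples divided by 8."""
    return sum(sum(row) for row in block) / 8.0


def reduced_image_from_dc(dc_grid):
    """One output pixel per 8x8 block: the block mean, which is DC / 8."""
    return [[dc / 8.0 for dc in row] for row in dc_grid]


flat_block = [[100] * 8 for _ in range(8)]
dc = dc_coefficient(flat_block)  # 6400 / 8 = 800.0
thumbnail = reduced_image_from_dc([[dc, 2 * dc]])
```

The result is an image reduced to one eighth of the original size in each direction, obtained without an inverse transform.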

[0085] In the description method of FIG. 10, the image
data files are stored in separate files. However, these
files may be stored together as a package in a video data
group storage file having a video format (for example,
Motion JPEG) which can be accessed at random. In this case,
the location of the video data is specified by a
combination of the URL showing the location of the image
data file and a frame number or a time stamp showing the
location in the image data file. The URL information
showing the location of the image data file may be
described in each frame information or may be described as
additional information outside of the arrangement of the
frame information.
[0086] Various methods can be taken to select the frame of
the original video or the like from which the video data
described in the video location information is created.
For example, the video data may be extracted at an equal
interval from the original video. Alternatively, where the
motion of the screen appears often, the video frame is
selected at a narrow interval, and where the motion of the
screen appears rarely, the video frame is selected at a
wide interval.

[0087] Here, referring to FIG. 11, there will be
explained, as one example of a method for selecting frames,
a method in which the frame is selected at a narrow
interval where the motion of the screen appears often,
while the frame is selected at a wide interval where the
motion of the screen appears rarely.
[0088] In FIG. 11, a horizontal axis represents the
selected frame number, and a curve 800 represents a change
in the scene change quantity (between adjacent frames). A
method for calculating the scene change quantity is the
same as the method used at the time of calculating the
display time described later. Here, in order to determine
an extraction interval in accordance with the motion of the
scene, there is shown a method for calculating an interval
at which the scene change quantity between video frames
from which the video data is extracted becomes constant.
The total of the scene change quantity between video frames
from which the video data is extracted is set to S_i, and
the total of the scene change quantity over all frames is
set to S (= Σ S_i), while the number of data items to be
extracted is n. In order to set the scene change quantity
between video frames from which video data is extracted to
a constant level, S_i = S/n may be provided. In FIG. 11,
the area S_i of the scene change quantity curve 800 divided
with the broken lines becomes constant. Then, for example,
the scene change quantity is accumulated from the extracted
frame, and the video frame at which the accumulated value
exceeds S/n is set as the frame F_i from which the video
data is extracted.
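The accumulation described above can be sketched as follows. This is a minimal illustration, not the method of FIG. 11 itself: it assumes that the first frame is always extracted, that scene_change[k] holds the scene change quantity between frame k and frame k + 1, and that a frame is extracted when the accumulated value reaches S/n (a "greater than or equal" test is used so that exact ties in the example are counted).

```python
def select_frames(scene_change, n):
    """Pick frames so that each section carries about S / n of scene change.

    scene_change[k] is the change between frame k and frame k + 1.
    Returns the start frame followed by one frame per completed section.
    """
    threshold = sum(scene_change) / n
    selected = [0]
    accumulated = 0.0
    for k, change in enumerate(scene_change):
        accumulated += change
        if accumulated >= threshold:
            selected.append(k + 1)
            accumulated = 0.0  # restart the accumulation from this frame
    return selected


# Little motion in frames 0 to 4, strong motion afterwards.
selected = select_frames([1, 1, 1, 1, 4, 4, 0, 0], n=3)
```

In the example, the interval between extracted frames is wide (0 to 4) where the motion is small and narrow (4, 5, 6) where the motion is large.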

[0089] If the video data is created from an I picture
frame of MPEG and the video frame calculated as the frame
from which the video data is to be created is not an I
picture, the video data is created from the I picture frame
in the vicinity thereof.
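Choosing the I picture in the vicinity of a calculated frame can be sketched as a nearest-neighbor lookup. The list of I picture frame numbers is assumed to be known in advance, and the tie-breaking rule (the earlier I picture wins) is an arbitrary choice for this sketch.

```python
def nearest_i_picture(target_frame, i_picture_frames):
    """Return the I picture frame number closest to target_frame.

    On a tie, min() keeps the first candidate, i.e. the earlier I picture
    when i_picture_frames is sorted in ascending order.
    """
    return min(i_picture_frames, key=lambda f: abs(f - target_frame))


i_frames = [0, 15, 30, 45]
snapped = nearest_i_picture(17, i_frames)
```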

[0090] By the way, in the method explained in FIG. 11, the
video frame which belongs to a section where the scene
change quantity = 0 is skipped. However, if a still picture
continues, the scene is important in many cases. Therefore,
if the scene change quantity = 0 continues for more than a
constant time, the frame at that time may be extracted. For
example, the scene change quantity may be accumulated from
the extracted frame so that the frame at which the
accumulated value exceeds S/n, or the frame at which the
scene change quantity = 0 has continued for more than a
constant time, is set as a frame F_i from which the video
data is extracted. The accumulated value of the scene
change quantity may or may not be cleared to 0. It is
possible to selectively clear the accumulated value based
on a request from the user.
[0091] In the case of the example of FIG. 11, it is
assumed that the display time information 121 is described
so that the display time becomes the same with respect to
any of the frames. When the video is reproduced in
accordance with this display time information 121, the
scene change quantity becomes constant. The display time
information 121 may be determined and described by a
separate method.

[0092] Next, there will be explained a case in which one
or a plurality of frames are allowed to correspond to one
item of frame information.
[0093] One example of the data structure of the special
reproduction control information is the same as that in
FIG. 8.
[0094] Hereinafter, a method for describing the video
location information will be explained by using FIGS. 12
through 14.

[0095] FIG. 12 explains a method for describing the video
location information for referring to the continuous frames
of the original video.

[0096] The method for describing the video location
information shown in FIG. 9 refers to one frame 201 in the
original video for conducting the special reproduction. In
contrast, the method for describing the video location
information shown in FIG. 12 describes a set 500 of a
plurality of continuous frames in the original video. The
set 500 of frames may include only some frames extracted
from the plurality of continuous frames within the original
video. The set 500 of frames may include only one frame.

[0097] If the set 500 of frames includes a plurality of
continuous frames or one frame in the original video, the
location of the start frame and the location of the end
frame are described, or the location of the start frame and
the continuation time of the set 500 are described, in the
description of the frame location (if one frame is
included, for example, the start frame is set equal to the
end frame). In the description of the location and the
time, a frame number, a time stamp or the like which can
specify frames in the stream is used.
[0098] If the set 500 of frames is a part of a plurality
of continuous frames in the original video, information
which enables the specification of the frames is described.
If the method for extracting the frames is determined and
the frames can be specified with the description of the
locations of the start frame and the end frame, the start
frame or the end frame may be described.

[0099] The display time information 501 shows the total
display time corresponding to the whole frame group
included in the corresponding frame set 500. The display
time of each frame included in the set 500 of frames can be
appropriately determined on the side of the device for the
special reproduction. As a simple method, there is
available a method in which the above total display time is
equally divided by the total number of frames in the set
500 to provide the display time of one frame. Various other
methods are available.
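The simple method of equally dividing the total display time can be written as a one-line calculation. The sketch assumes that the set 500 is described by inclusive start and end frame numbers; the function name is hypothetical.

```python
def per_frame_display_time(total_display_time, start_frame, end_frame):
    """Divide the total display time equally among the frames of one set.

    start_frame and end_frame are inclusive, so a single-frame set has
    start_frame == end_frame.
    """
    count = end_frame - start_frame + 1
    if count <= 0:
        raise ValueError("end_frame must not precede start_frame")
    return total_display_time / count


each = per_frame_display_time(2.0, start_frame=100, end_frame=103)
```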

[0100] FIG. 13 explains a method for describing video
location information for referring to a set of the image
data files.

[0101] The method for describing the video location
information shown in FIG. 12 directly refers to continuous
frames in the original video to be reproduced. The method
for describing the video location information shown in
FIG. 13 creates a set 600 of the image data files
corresponding to the original video frame set 602 extracted
from the original video stream in a separate file and
describes the location thereof. In the method for
describing the file location, the file can be handled in
the same manner by using, for example, the URL or the like,
even if the file is present on a local storage device or if
the file is present on a network. A set of the video
location information 101 showing the location of this image
data file and the video display time 121 showing a length
of the corresponding display time 601 can be described as
the frame information.
[0102] If a correspondence with the original frame is
required, information showing the frame set 602 of the
original video corresponding to the described frame
information (for example, information similar to the video
location information in the case of FIG. 12) may be
included in the frame information. The frame information
may then comprise the video location information, the
display time information and the original video
information. The original video information need not be
described when it is not required.
[0103] The configuration of the video data, the
preparation of the video data, the preparation of the
reduced video, the method for storing the video data and
the method for describing the location information such as
the URL or the like are the same as what has been described
above.

[0104] Various methods can be adopted in the same manner
as described above as to which frame of the original video
is selected to create the video data to be described in the
video location information. For example, the video data may
be extracted at an equal interval from the original video.
Alternatively, where the motion of the screen appears
often, a frame is extracted at a narrow interval, and where
the motion of the screen appears rarely, a frame is
extracted at a wide interval.
[0105] In the above embodiments, the image data file 300
corresponds to the original video 302 in a frame to frame
manner. It is also possible to make the location
information of the frame described as the original video
information have a time width.
[0106] FIG. 14 shows an example in which the original
video information is allowed to have a time width with
respect to FIG. 8. Original video information 3701 is added
to the frame information structure shown in FIG. 8. The
original video information 3701 comprises start point
information 3702 and section length information 3703, which
are the start point and the section length of the section
of the original video which is a target of the special
reproduction. The original video information 3701 may
comprise any information which can specify the section of
the original video having the time width. It may comprise
the start point information and end point information
instead of the start point information and the length
information.

[0107] FIG. 15 shows an example in which the original
video information is allowed to have a time width with
respect to FIG. 9. In this case, for example, as video
location information, display time information and original
video information included in the same frame information,
the location of the original video frame 3801, the display
time 3802, and the original video frame section 3803 which
comprises the start point (frame location) and the section
length are described to show that these correspond to each
other. That is, as a video representative of the original
video frame section 3803, the original video frame location
3801 described in the video location information is
displayed.
[0108] FIG. 16 shows an example in which the original
video information is allowed to have a time width with
respect to FIG. 10. In this case, for example, as video
location information, display time information and original
video information included in the same frame information,
the location of the image data file 3901 for the display,
the display time 3902, and the original video frame section
3903 which comprises the start point (frame location) and
the section length are described to show that these
correspond to each other.

[0109] That is, as a video representative of the original
video frame section 3903, the image 3901 in the image data
file described in the video location information is
displayed.

[0110] Furthermore, as shown in FIGS. 12 and 13, if a set
of frames is used as a video for the display, a section
different from the original video frame section for
displaying the video may be allowed to correspond to the
original video information.

[0111] FIG. 17 shows an example in which the original
video information is allowed to have a time width with
respect to FIG. 12. In this case, for example, as video
location information, display time information and original
video information included in the same frame information, a
set 4001 of frames in the original video, the display time
4002, and the original video frame section 4003 which
comprises the start point (frame location) and the section
length are described to show that these correspond to each
other.

[0112] At this time, the section 4001 of a set of frames
which is described as video location information, and the
original video frame section 4003 which is described as the
original video information are not necessarily required to
coincide with each other, and a different section may be
used for display.
[0113] FIG. 18 shows an example in which the original
video information is allowed to have a time width with
respect to FIG. 13. In this case, for example, as video
location information, display time information and original
video information included in the same frame information, a
set 4101 of frames in the video file, the display time
4102, and the original video frame section 4103 which
comprises the start point (frame location) and the section
length are described to show that these correspond to each
other.
[0114] At this time, the section of the set 4101 of frames
described as video location information, and the original
video frame section 4103 described as the original video
information are not necessarily required to coincide with
each other. That is, the section of the set 4101 of the
frames for the display may be shorter or longer than the
original video frame section 4103. Furthermore, a video
having completely different contents may be included
therein. In addition, only a particularly important section
may be extracted from the section described in the original
video location as the image data file so that the collected
video data is used.

[0115] At the time of displaying the videos based on, for
example, the summarized reproduction (special reproduction)
using these items of the frame information, it may be
desired that the corresponding frame in the original video
is referred to.

[0116] FIG. 19 shows a flow for starting the reproduction
from the frame of the original video corresponding to the
video frame displayed in special reproduction. At step
S3601, the reproduction start frame is specified in the
special reproduction. At step S3602, the original video
frame corresponding to the specified frame is calculated
with a method described later. At step S3603, the original
video is reproduced from the calculated frame.
[0117] This flow can be used for referring to the
corresponding location of the original video in addition to
special reproduction.

[0118] At step S3602, as one example of a method for
calculating the corresponding original video frame, there
is shown a method using proportional distribution with
respect to the display time of the specified frame. The
display time information included in the i-th frame
information is set to D_i sec, the section start location
of the original video information is set to t_i sec, and
the section length is set to d_i sec. If the location is
specified at which t sec has passed from the start of the
reproduction using the i-th frame information, the frame
location of the corresponding original video is
T = t_i + d_i × t/D_i.
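The proportional distribution T = t_i + d_i × t/D_i can be checked with a short numeric sketch. The argument names mirror the symbols of the paragraph above; the range check on t is an added assumption.

```python
def original_location(t, D_i, t_i, d_i):
    """Map an elapsed display time t to a location in the original video.

    D_i: display time of the i-th frame information (sec)
    t_i: section start location in the original video (sec)
    d_i: section length in the original video (sec)
    """
    if not 0 <= t <= D_i:
        raise ValueError("t must lie within the display time D_i")
    return t_i + d_i * t / D_i


# Halfway through a 2-second display of a section starting at 60 sec and
# lasting 30 sec, the corresponding original location is 75 sec.
T = original_location(t=1.0, D_i=2.0, t_i=60.0, d_i=30.0)
```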

[0119] Referring to FIGS. 20 and 21, as examples of a
method for selecting a frame, there will be explained a
method for extracting the frames at a narrow interval where
the motion of the screen appears often while extracting the
frames at a wide interval where the motion of the screen
appears rarely, in accordance with the motion of the
screen. The horizontal axis, the curve 800, and S_i and F_i
are the same as those in FIG. 11.
[0120] In the example of FIG. 11, the video data is
extracted one frame at a time at an interval at which the
scene change quantity between the frames from which the
video data is extracted is made constant. FIGS. 20 and 21
show examples in which a set of a plurality of frames is
extracted with the frame F_i as a reference. For example,
as shown in FIG. 20, the same number of continuous frames
may be extracted from each F_i. The frame length 811 and
the frame length 812 are equal to each other. As shown in
FIG. 21, the corresponding number of continuous frames may
be extracted so that the total of the scene change quantity
from F_i becomes constant. The area 813 and the area 814
are equal to each other. Various other methods can be
considered.
[0121] It is possible to use the frame selection method in
which the frame is extracted when the scene change
quantity = 0 continues for more than a constant time.
[0122] As in the case of FIG. 11, the display time
information 121 may be described so that the same display
time is provided with respect to any of the frame sets in
the cases of FIGS. 20 and 21. Alternatively, the display
time information may be determined and described by a
different method.
[0123] Next, one example of a processing for calculating
the display time will be explained.
[0124] FIG. 22 shows one example of a procedure of the
basic processing for calculating the display time so that
the scene change quantity becomes as constant as possible
when the video described in the video location information
is continuously reproduced in accordance with the time
described in the display time information.

[0125] This processing can be applied to a case in which
the frames are extracted by any method. However, if the
frames are extracted by the method shown in FIG. 11, for
example, the processing can be omitted, since the
processing shown in FIG. 11 selects the frames such that
the scene change quantity becomes constant when the frames
are displayed for a fixed time period.
[0126] At step S71, the scene change quantity between
adjacent frames is calculated with respect to all frames of
the original video. If each frame of the video is
represented as a bit map, the differential value of the
pixels between adjacent frames can be set as the scene
change quantity. If the video is compressed with MPEG, the
scene change quantity can be calculated by using a motion
vector.
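For the bit map case, one possible reading of the "differential value of the pixels" is the sum of absolute differences between corresponding pixels of adjacent frames. The sketch below uses that measure as an assumption and represents each grayscale frame as a list of rows; both the measure and the representation are illustrative choices.

```python
def scene_change_quantity(prev_frame, next_frame):
    """Sum of absolute pixel differences between two same-sized frames."""
    return sum(
        abs(a - b)
        for row_prev, row_next in zip(prev_frame, next_frame)
        for a, b in zip(row_prev, row_next)
    )


def scene_change_series(frames):
    """Scene change quantity between each pair of adjacent frames."""
    return [
        scene_change_quantity(frames[k], frames[k + 1])
        for k in range(len(frames) - 1)
    ]


frames = [
    [[0, 0], [0, 0]],
    [[0, 0], [0, 0]],   # still picture: no change from the previous frame
    [[10, 0], [0, 5]],  # two pixels changed
]
series = scene_change_series(frames)
```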

[0127] One example of a method for calculating the scene
change quantity will be explained.
[0128] FIG. 23 shows one example of a basic processing
procedure for calculating a scene change quantity of all
frames from the video streams compressed with MPEG.

[0129] At step S81, a motion vector is extracted from the
P picture frame. The video compressed with MPEG is
described with an arrangement of I pictures (intra-frame
encoded frames), P pictures (inter-frame encoded frames in
a forward prediction), and B pictures (inter-frame encoded
frames in a backward prediction), as shown in FIG. 24. The
P picture includes a motion vector from the preceding I
picture or P picture.

[0130] At step S82, the magnitude (intensity) of each motion
vector included in the frame of one P picture is calculated,
and the average thereof is set as the scene change quantity
from the preceding I picture or P picture.

[0131] At step S83, on the basis of the scene change quantity
calculated with respect to the P picture, the scene change
quantity is calculated for each individual frame, including the
frames other than the P picture. For example, if the average
value of the motion vectors of the P picture frame is p, and the
interval from the preceding I picture or P picture to which the
P picture refers is d, the scene change quantity per frame of
each frame is set to p/d.
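
Steps S82 and S83 can be sketched as follows. The sketch assumes that the motion vectors of one P picture have already been extracted from the stream as (dx, dy) pairs; the extraction itself (step S81) and the function names are outside what this description specifies and are hypothetical.

```python
import math


def p_picture_change(motion_vectors):
    """Step S82: average magnitude of the motion vectors of one P picture.

    Assumption: motion_vectors is a list of (dx, dy) pairs that were
    already decoded from the stream.
    """
    if not motion_vectors:
        return 0.0
    magnitudes = [math.hypot(dx, dy) for dx, dy in motion_vectors]
    return sum(magnitudes) / len(magnitudes)


def per_frame_change(p, d):
    """Step S83: spread the quantity p evenly over the d frames that
    separate the P picture from its reference picture (p/d per frame)."""
    if d <= 0:
        raise ValueError("d must be a positive frame interval")
    return [p / d] * d
```

Concatenating the lists returned for successive P pictures gives a per-frame scene change quantity for the whole stream.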



[0132] Subsequently, at step S72 in the procedure of FIG. 22,
the total of the scene change quantity of the frames from each
description target frame described in the video location
information up to the following description target frame is
calculated.
[0133] FIG. 25 describes the change in the scene change
quantity for each frame. The horizontal axis corresponds to the
frame number while a curve 1000 denotes the change in the scene
change quantity. When the display time of the video having the
location information of the frame information F_i is calculated,
the scene change quantity is accumulated over the section 1001
up to F_{i+1}, which corresponds to the frame location of the
next description target frame. The result is the area S_i of
the hatched portion 1002, which is regarded as the magnitude of
the motion at the frame location F_i.
[0134] Subsequently, at step S73 in the procedure of FIG. 22,
the display time of each frame is calculated. In order to keep
the scene change quantity as constant as possible, a longer
display time is allocated to a frame where the motion of the
screen is large, so that the ratio of the display time allocated
to the video of each frame location F_i to the reproduction time
is set to S_i / Σ_j S_j. When the total reproduction time is T,
the display time of each video is D_i = T × S_i / Σ_j S_j. The
value of the total reproduction time T is defined as the total
reproduction time of the original video.
[0135] If no scene change appears and S_i = 0, a lower limit
value (for example, 1) determined in advance may be entered, or
the frame information thereof may not be described. Even for a
frame where S_i is not 0 but the screen change is so small that
virtually no change is displayed in actual reproduction, the
lower limit value may be substituted or the frame information
may not be described. If the frame information is not
described, the value of S_i may or may not be added to S_{i+1}.
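
Steps S72 and S73, together with the lower limit handling, can be sketched as follows. This is an illustration under stated assumptions: the per-frame scene change quantity is given as a list, the description target frames are given as ascending frame indices, the last section is assumed to run to the end of the video, and the lower limit is applied by substitution (the alternative of omitting the frame information is not shown). The function and parameter names are hypothetical.

```python
def section_totals(change, targets):
    """Step S72: S_i is the sum of the per-frame quantities from
    target frame i up to (not including) target frame i+1.

    Assumption: the last section runs to the end of the video.
    """
    bounds = list(targets) + [len(change)]
    return [sum(change[a:b]) for a, b in zip(bounds, bounds[1:])]


def display_times(totals, total_time, lower_limit=0.0):
    """Step S73: D_i = T * S_i / sum_j S_j.

    Totals below lower_limit are replaced by lower_limit first.
    """
    floored = [max(s, lower_limit) for s in totals]
    denom = sum(floored)
    if denom == 0:
        raise ValueError("all section totals are zero")
    return [total_time * s / denom for s in floored]
```

For example, with totals [2, 4, 0] and T = 12 seconds the sections receive 4, 8 and 0 seconds. With a lower limit of 1 the third section receives a small non-zero share and the display times still sum to T.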
[0136] The processing for calculating this display time can be
conducted when the frame information is prepared by the special
reproduction control information creating apparatus, but the
processing can also be conducted at the time of the special
reproduction on the side of the video reproduction apparatus.
[0137] Next, there will be explained a case in which the
special reproduction is conducted.
[0138] FIG. 26 shows one example of a procedure for N times
high-speed reproduction on the basis of the special reproduction
control information that has been described.
[0139] At step S111, the display time D'_i at the time of
reproduction is calculated on the basis of the reproduction rate
information. Since the display time information described in
the frame information is the standard display time, the display
time D'_i = D_i / N of each frame is calculated when
reproduction at N times high speed is conducted.
[0140] At step S112, initialization for the display is
conducted, and i = 0 is set so that the first frame information
is displayed.

[0141] At step S113, it is determined whether the display time
D'_i of the i-th frame information is larger than the preset
threshold value of the display time.
[0142] If the display time is larger, the video of the video
location information included in the i-th frame information F_i
is displayed for D'_i seconds at step S114.



[0143] If the display time is not larger, the process proceeds
to step S115 to search in the forward direction for the i-th
frame information whose display time is not smaller than the
threshold value. During the search, the display times of all
frame information items that are smaller than the threshold
value are added to the display time of that i-th frame
information, and the display time of each frame information
item that is smaller than the threshold value is set to 0. The
reason for this processing is that, when the display time at the
time of reproduction becomes very short, the time for preparing
the video to be displayed becomes longer than the display time,
with the result that the display cannot be conducted in time.
Therefore, if the display time becomes very short, the process
proceeds to the next step without displaying the video. At that
time, the display time of the video which is not displayed is
added to the display time of the video to be displayed so that
the total display time remains unchanged.
[0144] At step S116, it is determined whether "i" is smaller
than the total number of frame information items in order to
determine whether or not frame information which has not been
displayed remains. If "i" is smaller than the total number of
frame information items, the process proceeds to step S117 to
increment "i" by one to prepare for the display of the next
frame information. When "i" reaches the total number of frame
information items, the reproduction processing is completed.
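
The procedure of FIG. 26 can be sketched as a scheduling function that returns which frame information items are displayed and for how long. The sketch assumes that the standard display times D_i are given as a list of seconds. The treatment of short items left over at the very end of the list (their time is added to the last displayed item) is an assumption, since the description does not cover that case; the function name is hypothetical.

```python
def schedule_fast_playback(display_times, n, threshold):
    """Return (frame index, seconds) pairs for N times reproduction.

    Items whose scaled time D'_i = D_i / N does not exceed the
    threshold are skipped, and their time is carried forward to the
    next displayed item so that the total time is unchanged.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    schedule = []
    carried = 0.0
    for i, d in enumerate(display_times):
        scaled = d / n                      # step S111
        if scaled > threshold:              # step S113
            schedule.append((i, scaled + carried))  # step S114
            carried = 0.0
        else:
            carried += scaled               # step S115
    # Assumption: trailing skipped time goes to the last displayed item.
    if carried and schedule:
        last_index, last_time = schedule[-1]
        schedule[-1] = (last_index, last_time + carried)
    return schedule
```

A player would then iterate over the returned pairs and show each video for the given number of seconds.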
[0145] FIG. 27 shows one example of a procedure for conducting
N times high-speed reproduction on the basis of the described
special reproduction control information by taking the display
cycle as a reference.
[0146] At step S121, the display time D'_i of each frame is
calculated as D'_i = D_i / N for N times high-speed
reproduction. Here, because display is actually tied to the
display cycle, the video cannot always be displayed for exactly
the calculated time.

[0147] FIG. 28 shows the relationship between the calculated
display time and the display cycle. The time axis 1300 shows
the calculated display time while the time axis 1301 shows the
display cycle based on the display rate. If the display rate is
f frames/sec, the interval of the display cycle becomes 1/f sec.
[0148] Consequently, at step S122, the frame information F_i
including the start point of the display cycle is searched for,
and the video included in the frame information F_i is displayed
for one display cycle (1/f sec) at step S123.

[0149] For example, the display cycle 1302 (FIG. 28) displays
the video of the frame information corresponding to this display
time because the display start point 1303 is included in the
calculated display time 1304.
[0150] As a method for making the display cycle correspond to
the frame information, the video at the location nearest to the
start point of the display cycle may be displayed, as shown in
FIG. 29. If the display time becomes smaller than the display
cycle, like the display time 1305 of FIG. 28, the display of the
video may be omitted. If the video is forcibly displayed, the
display times before and after the video are shortened so that
the total display time remains unchanged.

[0151] At step S124, it is determined whether the current
display is the final display or not. If the current display is
the final display, the processing is completed. If the display
is not the final display, the process proceeds to step S125 to
conduct the processing of the next display cycle.
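
The display-cycle based procedure of FIG. 27 can be sketched as follows. The sketch assumes the variant in which each display cycle shows the frame information whose calculated display interval contains the start point of that cycle, so that an item shorter than one cycle may be omitted. The function name and the list-based representation are hypothetical.

```python
def frames_per_cycle(display_times, n, f):
    """Return, for each display cycle of length 1/f seconds, the index
    of the frame information whose interval contains the cycle start.

    display_times holds the standard display times D_i in seconds;
    they are scaled to D'_i = D_i / N first (step S121).
    """
    if n <= 0 or f <= 0:
        raise ValueError("n and f must be positive")
    scaled = [d / n for d in display_times]
    chosen = []
    if not scaled:
        return chosen
    total = sum(scaled)
    i = 0
    end = scaled[0]          # end time of the interval of item i
    k = 0                    # index of the current display cycle
    while k / f < total:     # step S124: stop after the final cycle
        start = k / f
        # Step S122: advance to the item containing the cycle start.
        while start >= end and i + 1 < len(scaled):
            i += 1
            end += scaled[i]
        chosen.append(i)     # step S123: display for one cycle
        k += 1               # step S125: next display cycle
    return chosen
```

With display times of 1.25, 0.125 and 1.0 seconds, N = 1 and f = 2, the middle item never contains a cycle start and is therefore not displayed.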

[0152] FIG. 30 shows another example of a data structure for
describing the frame information. The frame information
included in the data structure of FIG. 8 or FIG. 14 summarizes
a single original video. A plurality of original videos can be
summarized by expanding the frame information, and FIG. 30
shows such an example. Original video location information
4202 indicating the location of the original video file is added
to the original video information 4201 included in the
individual frame information. The file described in the
original video location information 4202 need not be handled in
its entirety. The file can be used in a form in which only a
portion of it is extracted. In this case, not only file
information such as a file name but also section information
showing which section of the file is the object is additionally
described. Plural sections may be selected from the original
video.

[0153] Furthermore, if several kinds of original videos are
present and identification information is individually added to
the videos, the original video identification information may be
described in place of the original video location information.

[0154] FIG. 31 explains an example in which a plurality of
original videos are summarized and displayed by using the frame
information to which the original video location information is
added. In this example, three videos are summarized to display
one summarized video. With respect to the video 2, in place of
the whole section, two sections 4301 and 4302 are taken out and
handled as respective videos. As the frame information,
together with this original video information, the frame
location (4303 with respect to 4301) of the respective
representative video is described as the video location
information while the display time (4304 with respect to 4301)
is described as the display time information.
[0155] FIG. 32 explains another example in which a plurality
of original videos are summarized and displayed by using the
frame information to which the original video location
information is added. In this example, three videos are
summarized to display one summarized video. With respect to the
video 2, in place of the whole section, a portion of the section
is taken out. A plurality of sections may be taken out as
described in FIG. 31. As the frame information, together with
these items of the original video information (for example, the
section information 4401 in addition to the video 2), the
storage location of the respective representative video files
4402 is described as the video location information and the
display time 4403 is described as the display time information.

[0156] The addition of the original video location information
to the frame information, which has been explained in these
examples, can be applied in exactly the same way to the case in
which a set of frames is used as the video location information,
with the result that a plurality of original videos are
summarized and displayed.
[0157] FIG. 33 shows another data structure for describing the
frame information. In this data structure, in addition to the
video location information 101, the display time information 121
and the original video information 3701 which have already been
explained, motion information 4501 and interest region
information 4502 are added. The motion information 4501
describes the magnitude of the motion (the scene change
quantity) in the section (the section described in the original
video information) of the original video corresponding to the
frame information. The interest region information 4502
describes a region of particular interest in the video which is
described in the video location information.

[0158] The motion information can be used for calculating the
display time of the video described in the video location
information, in the same way as the display time is calculated
from the motion of the video as shown in FIG. 22. In this case,
even when the display time information is omitted and only the
motion information is described, special reproduction such as
high-speed reproduction can be conducted in the same manner as
in the case in which the display time is described. In this
case, the display time is calculated at the time of
reproduction.

[0159] Both the display time information and the motion
information can be described at the same time. In that case, an
application for displaying uses the required one of the two, or
uses both in combination, in accordance with the processing.

[0160] For example, a display time calculated irrespective of
the motion is described in the display time information. A
method for calculating the display time for cutting out
important scenes from the original video corresponds to this.
At the time of high-speed reproduction of the summarized
contents calculated in this manner, the motion information is
used so that a portion with a large motion is reproduced slowly
while a portion with a small motion is reproduced quickly, with
the result that high-speed reproduction in which little is
overlooked is enabled.
[0161] The interest region information is used when a region
of particular interest is present in the video described in the
video location information. For example, faces of persons who
seem to be important correspond to this. At the time of
displaying the video including such interest region information,
the display may be conducted by overlaying a square frame so
that the interest region can be easily detected. The frame
display is not indispensable, and the video may simply be
displayed as it is.
[0162] The interest region information can be used for
processing and displaying the special reproduction control
information such as the frame information. For example, if only
a part of the frame information is reproduced and displayed, the
frame information including the interest region information is
displayed with priority. Further, frame information including a
square area with a large area may be assumed to have higher
importance, thereby making it possible to selectively display
the video.

[0163] As shown above, there has been explained an example in
which the processing is conducted on the basis of the scene
change quantity. Hereinafter, there will be explained a case in
which the importance information is used.

[0164] FIG. 34 is a view showing examples of a data structure
of the frame information attached to the video.
[0165] Importance information 122 is described in addition to
or in place of the display time control information 102 in the
data structure of the frame information of FIG. 1. The display
time is calculated based on the importance information 122.

[0166] The importance information 122 represents the
importance of the corresponding frame (or set of frames). The
importance is represented, for example, as an integer in a fixed
range (for example, 0 to 100), or as a real number in a fixed
range (for example, 0 to 1). Otherwise, the importance
information 122 may be represented as an integer or a real
number value without setting an upper limit. The importance
information 122 may be attached to all the frames of the video,
or only to the frames at which the importance changes.

[0167] In this case as well, it is possible to take any form
of FIGS. 9, 10, 12, and 13. The frame extraction method of
FIGS. 11, 20, and 21 can also be used. In this case, the scene
change quantity of FIGS. 11, 20, and 21 may be replaced by the
importance.
[0168] In the example which has been explained above, the
display time is set from the scene change quantity. However,
the display time may also be set from the importance
information. Hereinafter, the method for setting the display
time will be explained.
[0169] In setting the display time on the basis of the scene
change quantity exemplified above, in order to make the video
contents easy to understand, the display time is set long where
the change quantity is large and set short where the change
quantity is small. In setting the display time on the basis of
the importance, the display time is set long where the
importance is high and set short where the importance is low.
Since the method for setting the display time according to the
importance is basically similar to the method for setting the
display time based on the scene change quantity, the method will
be briefly explained.

[0170] FIG. 35 shows one example of the basic processing
procedure in this case.
[0171] At step S191, the importance of all frames of the
original video is calculated. A concrete method thereof will be
exemplified later.
[0172] At step S192, the total of the importance from the
description object frame described in the video location
information to the next description object frame is calculated.

[0173] FIG. 37 describes the change in the importance for each
frame. Reference numeral 2200 denotes the importance. When the
display time of the video having the location information of the
frame information F_i is calculated, the importance in the
section up to F_{i+1}, which is the next description object
frame location, is accumulated. The accumulation result is the
area S'_i of the hatched portion 2202.

[0174] At step S193, the display time of each frame is
calculated. Suppose that the ratio of the display time
allocated to the video at each frame location F_i to the
reproduction time is set to S'_i / Σ_j S'_j. When the total
reproduction time is T, the display time of each video becomes
D_i = T × S'_i / Σ_j S'_j. The value of the total reproduction
time T is a standard reproduction time, defined as the total
reproduction time of the original video.

[0175] When the total of the importance is S'_i = 0, the
preset lower limit value (for example, 1) may be described, or
the frame information may not be described. Even if S'_i is not
0, when the importance is so small that such a frame would
virtually not be displayed, the lower limit value may be
described or the frame information may not be described. If the
frame information is not described, the value of S'_i may or may
not be added to S'_{i+1}.
[0176] As shown in FIG. 34, in the data structure of the frame
information of FIG. 1, the video location information 101, the
display time information 121 and the importance information 122
may be described in each frame information "i". At the time of
the special reproduction, there are the following cases: the
display time information 121 is used but the importance
information 122 is not used; the importance information 122 is
used but the display time information 121 is not used; both the
importance information 122 and the display time information 121
are used; and neither the importance information 122 nor the
display time information 121 is used.

[0177] The processing of calculating the display time can be
conducted when the frame information is prepared by the special
reproduction control information creating apparatus. However,
the processing may also be conducted on the side of the video
reproduction apparatus at the time of the special reproduction.



[0178] Next, a method (for example, step S191 of FIG. 35) for
calculating the importance of each frame or each scene (video
frame section) will be explained.

[0179] Since various factors are normally intertwined in the
judgment as to whether a certain scene of a video is important,
the most appropriate method for calculating the importance is a
method in which a person determines the importance. In this
method, an importance evaluator evaluates the importance for
each scene of the video, or for each fixed interval, and the
importance is input as importance data. The importance data
referred to here is a correspondence table between a frame
number or time and an importance value. In order to avoid
subjective evaluation of importance, a plurality of importance
evaluators may evaluate the same video, and the average value
(or a median or the like) is calculated for each scene or each
video frame section so that the importance is finally
determined. In such manual input of the importance data, it is
possible to reflect in the importance vague impressions and a
plurality of elements which cannot be expressed in words.
[0180] In order to save the trouble of determination by a
person, it is preferable to anticipate a phenomenon that is
likely to appear in a video scene which seems to be important,
and to use processing which automatically evaluates such a
phenomenon and converts it into importance. Here, some examples
are shown in which the importance is automatically created.
[0181] FIG. 38 shows an example of a processing procedure for
automatically calculating importance data on the basis of the
idea that a scene having a large sound level is important.
FIG. 38 is drawn as a function block diagram.

[0182] In the sound level calculation processing at step S210,
when the sound data attached to the video is input, the sound
level at each time is calculated. Since the sound level changes
greatly from instant to instant, smoothing processing or the
like may be conducted in the sound level calculation processing
at step S210.
[0183] In the importance calculation processing at step S211,
processing is conducted for converting the sound level output as
a result of the sound level calculation processing into the
importance. For example, the input sound level is linearly
converted into a value of 0 to 100, with a preset lowest sound
level mapped to 0 and a preset highest sound level mapped to
100. A sound level not more than the lowest sound level is set
to 0 while a sound level not less than the highest sound level
is set to 100. As a result of the importance calculation
processing, the importance at each time is calculated and output
as importance data.
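
The conversion of steps S210 and S211 can be sketched as follows. The linear mapping with clamping follows the example given above. The smoothing is shown as a trailing moving average, which is only one possible choice and is an assumption, as are the function names and the default window size.

```python
def smooth(levels, window=3):
    """Step S210 (optional): trailing moving average of the sound level.

    Assumption: a simple moving average is used as the smoothing.
    """
    out = []
    for i in range(len(levels)):
        chunk = levels[max(0, i - window + 1): i + 1]
        out.append(sum(chunk) / len(chunk))
    return out


def sound_importance(level, lowest, highest):
    """Step S211: map a sound level linearly onto 0..100.

    Levels at or below `lowest` give 0, levels at or above
    `highest` give 100.
    """
    if highest <= lowest:
        raise ValueError("highest must be greater than lowest")
    clamped = min(max(level, lowest), highest)
    return 100.0 * (clamped - lowest) / (highest - lowest)
```

Applying `sound_importance` to each smoothed level yields the importance data at each time.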

[0184] FIG. 39 shows an example of a processing procedure of
another method for automatically calculating the importance
level. FIG. 39 is drawn as a function block diagram.
[0185] The processing of FIG. 39 determines that a scene in
which important words registered in advance appear frequently in
the sound attached to the video is important.
[0186] In the sound recognition processing at step S220, when
the sound data attached to the video is input, the language
(words) spoken by a person is converted into text data.
[0187] In the important word dictionary 221, words which are
likely to appear in important scenes are registered. If the
degree of importance of the registered words differs, a weight
is added to each of the registered words.

[0188] In the word collation processing at step S222, the text
data which is the output of the sound recognition processing is
collated with the words registered in the important word
dictionary 221 to determine whether or not important words are
spoken.
[0189] In the importance calculation processing at step S223,
the importance in each scene of the video or at each time is
calculated from the result of the word collation processing. In
this calculation, the number of appearances of important words
and the weights of the important words are used, so that
processing is conducted to increase the importance around the
time at which, for example, important words have appeared (or of
the scene in which the important words have appeared) by a
constant value, or by a value proportional to the weight of the
important words. As a result of the importance calculation
processing, the importance at each time is calculated and output
as importance data.
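
The word collation and importance calculation of steps S222 and S223 can be sketched as follows. The sketch assumes that the recognition result is already available as (time index, word) pairs and that the dictionary maps each registered word to a weight. The parameters `spread` (how many time indices around an appearance are raised) and `base` (the increment per unit weight) are hypothetical, as is the function name.

```python
def word_importance(recognized, dictionary, duration, spread=1, base=10.0):
    """Return the importance at each time index 0..duration-1.

    recognized: list of (time index, word) pairs (assumed input form).
    dictionary: registered important words mapped to their weights.
    Each appearance of a registered word raises the importance of the
    time indices within `spread` of it by base * weight.
    """
    importance = [0.0] * duration
    for t, word in recognized:
        weight = dictionary.get(word)       # step S222: collation
        if weight is None:
            continue
        first = max(0, t - spread)
        last = min(duration - 1, t + spread)
        for k in range(first, last + 1):    # step S223: raise importance
            importance[k] += base * weight
    return importance
```

The same function could be applied to text obtained from telop recognition, since only the source of the text differs.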
[0190] If the weights of all the words are set to the same
value, the important word dictionary 221 becomes unnecessary.
This is because it is then assumed that a scene in which many
words are spoken is important. In this case, in the word
collation processing at step S222, processing for counting the
number of words output from the sound recognition processing is
conducted. Not only the number of words but also the number of
characters may be counted.

[0191] FIG. 40 shows an example of a processing procedure of
another method for automatically calculating the importance
level. FIG. 40 is also drawn as a function block diagram.
[0192] The processing of FIG. 40 determines that a scene in
which many important words registered in advance appear in the
telop appearing in the video is important.

[0193] In the telop recognition processing at step S230, the
character location in the video is specified, and characters
are recognized by converting the video region at the character
location into a binary image. The recognized result is output
as text data.
[0194] The important word dictionary 231 is the same as the
important word dictionary 221 of FIG. 39.
[0195] In the word collation processing at step S232, in the
same manner as at step S222 in the procedure of FIG. 39, the
text data which is the output of the telop recognition
processing is collated with the words registered in the
important word dictionary 231 to determine whether or not
important words have appeared.
[0196] In the importance calculation processing at step S233,
the importance at each scene or at each time is calculated from
the number of appearances of important words and the weights of
the important words, in the same manner as at step S223 in the
procedure of FIG. 39. As a result of the importance calculation
processing, the importance at each time is determined and output
as importance data.

[0197] If the weights of all the words are set to the same
value, the important word dictionary 231 becomes unnecessary.
This is because it is then assumed that a scene in which many
words appear is an important scene. In this case, in the word
collation processing at step S232, processing is conducted for
simply counting the number of words output from the telop
recognition processing. Not only the number of words but also
the number of characters may be counted.
[0198] FIG. 41 shows an example of a processing procedure of
still another method for automatically calculating the
importance level. FIG. 41 is drawn as a function block diagram.

[0199] The processing of FIG. 41 determines that the larger
the character size of the telop appearing in the video, the more
important the scene.
[0200] In the telop detection processing at step S240,
processing is conducted for specifying the location of a
character string in the video.
[0201] In the character size calculation processing at step
S241, individual characters are extracted to calculate the
average value or the maximum value of the size (area) of the
characters.

[0202] In the importance calculation processing at step S242,
an importance proportional to the character size output from the
character size calculation processing is calculated. If the
calculated importance is too large or too small, threshold
processing is conducted to restrict the importance to a preset
range. As a result of the importance calculation processing,
the importance at each time is calculated and output as
importance data.
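
Steps S241 and S242 can be sketched as follows. The sketch assumes that the areas of the individual characters found at one time have already been measured. The proportionality constant `scale` and the bounds `lower` and `upper` of the preset range are hypothetical parameters, as is the function name.

```python
def telop_importance(char_areas, scale=0.25, lower=0.0, upper=100.0,
                     use_max=False):
    """Return an importance proportional to the telop character size.

    char_areas: areas of the individual characters (assumed input).
    Step S241 takes the average or the maximum area; step S242
    multiplies it by `scale` and restricts the result to the range
    lower..upper.
    """
    if not char_areas:
        return lower  # no telop detected at this time
    size = max(char_areas) if use_max else sum(char_areas) / len(char_areas)
    return min(max(scale * size, lower), upper)
```

Whether the average or the maximum is the better size measure depends on the material, which is why both are selectable here.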

[0203] FIG. 42 shows an example of the processing procedure of
still another method for automatically calculating the
importance level. FIG. 42 is drawn as a function block diagram.

[0204] The processing of FIG. 42 determines that the 
scene in which human faces appear in the video is im- 
portant. 

[0205] In the face detection processing at step S250,
processing is conducted for detecting areas which look like a
human face in the video. As a result of the processing, the
number of areas (number of faces) which are determined to be a
human face is output. Information on the size (area) of each
face may be output at the same time.
[0206] In the importance calculation processing at step S251,
the number of faces which is the output of the face detection
processing is multiplied by a constant to calculate the
importance. If the output of the face detection processing
includes face size information, the calculation is conducted so
that the importance increases with an increase in the size of
the faces. For example, the area of the face is multiplied by a
constant to calculate the importance. As a result of the
importance calculation processing, the importance at each time
is calculated and output as importance data.
[0207] FIG. 43 shows an example of the processing procedure of
still another method for automatically calculating the
importance level. FIG. 43 is also drawn as a function block
diagram.

[0208] In the processing of FIG. 43, it is determined that a
scene in which a video similar to a video registered in advance
appears is important.
[0209] The video which should be determined to be important is
registered in the important scene dictionary 260. The video is
recorded as raw data or in a data compressed form. Instead of
the video itself, a characteristic quantity (a color histogram,
a frequency or the like) of the video may be recorded.
[0210] In the similarity/non-similarity calculation processing
at step S261, the similarity/non-similarity between the video
registered in the important scene dictionary 260 and the input
video data is calculated. As the non-similarity, the total of
the squared errors or the total of the absolute differences is
used. If the video data itself is recorded in the important
scene dictionary 260, the total of the squared errors or the
total of the absolute differences over the corresponding pixels
is calculated as the non-similarity. If the color histogram of
the video is recorded in the important scene dictionary 260, the
same color histogram is calculated for the input video data, and
the total of the squared errors or the total of the absolute
differences between the histograms is set as the non-similarity.

[0211] In the importance calculation processing at step S262,
the importance is calculated from the similarity/non-similarity
which is the output of the similarity/non-similarity calculation
processing. The importance is calculated in such a manner that
larger similarity provides greater importance if the similarity
is input, while larger non-similarity provides smaller
importance if the non-similarity is input. As a result of the
importance calculation processing, the importance at each time
is calculated and output as importance data.
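
The histogram case of steps S261 and S262 can be sketched as follows. The non-similarity is computed as the total of the squared errors or of the absolute differences, as stated above. The mapping from non-similarity to importance is only required to decrease as the non-similarity grows; the reciprocal form used here, the `scale` parameter and the function names are assumptions made for illustration.

```python
def non_similarity(hist_a, hist_b, squared=True):
    """Step S261: total squared error (or total absolute difference)
    between two histograms with the same number of bins."""
    if len(hist_a) != len(hist_b):
        raise ValueError("histograms must have the same number of bins")
    if squared:
        return sum((a - b) ** 2 for a, b in zip(hist_a, hist_b))
    return sum(abs(a - b) for a, b in zip(hist_a, hist_b))


def importance_from_non_similarity(value, scale=100.0):
    """Step S262: larger non-similarity gives smaller importance.

    Assumption: a reciprocal mapping; identical histograms give `scale`.
    """
    if value < 0:
        raise ValueError("non-similarity must not be negative")
    return scale / (1.0 + value)
```

The same two functions apply to raw pixel data if the flattened pixel values are passed in place of the histogram bins.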
[0212] Furthermore, as another method for automatically
calculating the importance, a scene having a high instantaneous
viewing rate may be set as an important scene. The data on the
instantaneous viewing rate is obtained by tallying the results
of a viewing rate survey, and the importance is calculated by
multiplying the instantaneous viewing rate by a constant.
Needless to say, there are various other methods.

[0213] The importance calculation processing may be solely conducted, or a plurality of data items may be used at the same time to calculate the importance. In the latter case, for example, the importance of one video is calculated with several different methods to calculate the final importance as an average value or a maximum value.
[0214] In the above embodiment, the explanation has been given by citing the scene change quantity and the importance. However, it is possible to use one item of information or a plurality of items of information (described in the frame information) together with the scene change quantity or the importance or instead of the scene change quantity or importance.
[0215] Next, there will be explained a case in which information for the control of reproduction/non-reproduction is added to the frame information (see FIG. 1).
[0216] It is desired that either only a specific scene or a part thereof (for example, a highlight scene) or only a scene or a part thereof in which a specific person appears is reproduced. Thus, there is a demand for watching only a portion of the video.
[0217] In order to satisfy this desire, the reproduction/non-reproduction information may be added to the frame information for controlling the reproduction or the non-reproduction. As a consequence, only a part of the video is reproduced or only a part of the video is not reproduced on the basis of the reproduction/non-reproduction information.
[0218] FIGS. 44, 45, and 46 show examples of a data structure in which the reproduction/non-reproduction information is added.
[0219] FIG. 44 shows a data structure in which the reproduction/non-reproduction information 123 is added to the data structure of FIG. 8. FIG. 45 shows a data structure in which the reproduction/non-reproduction information 123 is added to the data structure of FIG. 34. FIG. 46 shows a data structure in which the reproduction/non-reproduction information 123 is added to the data structure of FIG. 35. Though not shown, it is possible to add the reproduction/non-reproduction information 123 to the data structure of FIG. 1.
[0220] The reproduction/non-reproduction information 123 may be binary information specifying whether the video is reproduced or not or a continuous value such as reproduction level or the like.
[0221] For example, in the latter case, when the reproduction level exceeds a certain threshold value at the time of reproduction, the video is reproduced. When the reproduction level is less than the threshold value, the video is not reproduced. The user can directly or indirectly specify the threshold value.
[0222] The reproduction/non-reproduction information 123 may be set as independent information to be stored. If the reproduction or non-reproduction is selectively specified, the non-reproduction can be specified when the display time shown in the display time information 121 is set to a specific value (for example, 0 or -1) or when the importance indicated by the importance information 122 is set to a specific value (for example, 0 or -1). The reproduction/non-reproduction information 123 may not be added.
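The alternatives of paragraphs [0220] to [0222] can be sketched as follows. The class `FrameInfo`, its field names, the default threshold and the handling of a level exactly equal to the threshold are assumptions made for illustration; only the reference numerals in the comments come from the description.

```python
from dataclasses import dataclass
from typing import Optional, Union

# Specific values that mark non-reproduction when no independent
# information 123 is stored (the description gives 0 or -1 as examples).
NON_REPRODUCTION_VALUES = (0, -1)


@dataclass
class FrameInfo:
    video_location: int                             # video location information 101
    display_time: Optional[float] = None            # display time information 121
    importance: Optional[float] = None              # importance information 122
    reproduction: Union[bool, float, None] = None   # reproduction/non-reproduction information 123


def is_reproduced(info, threshold=0.5):
    """Decide whether a frame (or frame group) is reproduced."""
    rep = info.reproduction
    if isinstance(rep, bool):      # binary information
        return rep
    if rep is not None:            # continuous reproduction level
        return rep > threshold     # a level equal to the threshold is treated as non-reproduction here
    # No independent information 123: fall back on the specific values.
    if info.display_time is not None and info.display_time in NON_REPRODUCTION_VALUES:
        return False
    if info.importance is not None and info.importance in NON_REPRODUCTION_VALUES:
        return False
    return True
```

Keeping the three cases in one function shows the trade-off the description discusses: an independent flag costs extra data, while the specific-value convention reuses fields that are already present.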

[0223] If the reproduction or non-reproduction is specified with a level value, the display time information 121 and/or the importance information 122 (represented by the level value) can be used as a substitute.
[0224] If the reproduction/non-reproduction information 123 is maintained as independent information, the quantity of data increases by that quantity. It is possible to see a digest of the video by allowing the non-reproduction specification portion not to be reproduced on the reproduction side. It is also possible to see the whole video by reproducing the non-reproduction specified portion. If the reproduction/non-reproduction information 123 is not maintained as independent information, it is necessary to appropriately change the display time specified, for example, as 0 in order to see the whole video by reproducing the non-reproduction specified portion.

[0225] The reproduction/non-reproduction information 123 may be input by man or may be determined with some conditions. For example, when the motion information of the video is set to a constant value or more, the video is reproduced. When the motion information of the video is not set to a constant value or more, the video is not reproduced so that only a brisk motion portion can be reproduced. When it is determined from color information that the skin color is larger or smaller than the constant value, only the scene where man appears can be reproduced. A method for calculating the information with the magnitude of sound, and a method for calculating the information from the reproduction program information which is input in advance can be considered. The importance may be calculated with some technique to create the reproduction/non-reproduction information 123 from the importance information. When the reproduction/non-reproduction information is set to a continuous value, the importance may be calculated by converting the information into the reproduction/non-reproduction information.
[0226] FIG. 47 shows an example in which reproduction/non-reproduction control is carried out so that video is reproduced on the basis of the reproduction/non-reproduction information 123.
[0227] In FIG. 47, it is supposed that the original video 2151 is reproduced on the basis of the video frame location information represented with F1 through F6 or the video frame group location information 2153 and the display time information represented with D1 through D6. At this time, it is supposed that the reproduction/non-reproduction information is added to the display time information 2154. In this example, the sections of D1, D2, D4 and D6 can be reproduced, and other sections cannot be reproduced. The sections of D1, D2, D4 and D6 are continuously reproduced as the reproduction video 2152 (while other sections cannot be reproduced).

[0228] For example, in the frame $F_i$ of the reproduction video, if the display time is set to $D^{+}_{i}$ when the reproduction/non-reproduction information 123 shows reproduction, and the display time is set to $D^{-}_{i}$ when the reproduction/non-reproduction information 123 shows the non-reproduction, $\sum_{i} D^{+}_{i} = T$ when the total time of the reproduction portion of the original video is set to $T$. Normally, the display time $D^{+}_{i}$ is set to a time which is required to reproduce the original video at a normal speed. The reproduction speed may be set to a predetermined high speed. Information may be described as to which times the speed is to be set. When it is desired that the video is reproduced at $N$ times high speed, the display time $D^{+}_{i}$ of the reproduction portion is multiplied by $1/N$. For example, in order to perform reproduction in the predetermined time $D'$, the display time $D^{+}_{i}$ of each reproduction portion may be multiplied by $D' / \sum_{i} D^{+}_{i}$ and displayed.
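The two scalings of paragraph [0228] can be written out as a short sketch. The function names are assumptions; the list `display_times` stands for the display times of the portions to be reproduced.

```python
def scale_for_speed(display_times, n):
    """Display times for N times high-speed reproduction: each time is multiplied by 1/N."""
    if n <= 0:
        raise ValueError("the speed factor must be positive")
    return [d / n for d in display_times]


def scale_to_total(display_times, target_total):
    """Display times scaled so that their total equals the predetermined time D'."""
    total = sum(display_times)
    if total <= 0:
        raise ValueError("the total display time must be positive")
    factor = target_total / total
    return [d * factor for d in display_times]
```

For example, display times of 1 second and 3 seconds fitted into a predetermined time of 2 seconds are each multiplied by 2/4, giving 0.5 seconds and 1.5 seconds.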

[0229] If the display time of each frame (or a frame group) is determined on the basis of the frame information, the determined display time may be adjusted.
[0230] In a method in which the calculated display time is not adjusted, the display time which is calculated without taking into consideration the generation of the non-reproduction section is used as it is, so that when the display time exceeding 0 is originally allocated to the non-reproduction section, the whole display time is shortened for that allocation portion.
[0231] In a method in which the calculated display time is adjusted, for example, if the display time exceeding 0 is originally allocated to the non-reproduction section, the adjustment is made by multiplying by a constant number the display time of each of the frames (or the frame group) to be reproduced so that the whole display time becomes equal to the time at the time of the reproduction of the non-reproduction section.
[0232] The user may make a selection as to whether the adjustment is to be made.
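The difference between the unadjusted and adjusted methods of paragraphs [0230] and [0231] can be sketched as follows. The representation of a frame as a `(display_time, reproduced)` pair and the function name are assumptions made for illustration.

```python
def reproduced_display_times(frames, adjust):
    """Return the display times of the frames that are reproduced.

    frames: list of (display_time, reproduced) pairs in display order.
    adjust: if True, the reproduced frames are stretched by a constant so
            that the whole display time equals the total that was calculated
            with the non-reproduction sections included.
    """
    total_all = sum(t for t, _ in frames)
    kept = [t for t, reproduced in frames if reproduced]
    if not adjust:
        return kept                # whole display time becomes shorter
    total_kept = sum(kept)
    if total_kept == 0:
        return kept                # nothing to stretch
    factor = total_all / total_kept
    return [t * factor for t in kept]
```

With frames of 1, 2 and 1 seconds where the middle frame is not reproduced, the unadjusted result lasts 2 seconds and the adjusted result lasts 4 seconds.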
[0233] If the user specifies the N times reproduction, the N times high-speed reproduction processing may be conducted without the adjustment of the calculated display time. The N times high-speed reproduction processing may be conducted on the basis of the display time after the adjustment of the calculated display time in the above manner (the display time of the former becomes shorter).
[0234] The user may specify the whole display time. In this case as well, for example, the display time of each frame (or a frame group) to be reproduced is multiplied by a constant number to make an adjustment so that the display time becomes equal to the specified whole display time.
[0235] FIG. 48 shows one example of the processing procedure for reproducing only a portion of the video on the basis of the reproduction/non-reproduction information 123.
[0236] At step S162, the frame information (video location information and display time information) is read to determine whether the frame is to be reproduced from the reproduction/non-reproduction information in the display time information at step S163.
[0237] When it is determined that the reproduction is to be conducted, the frame is displayed for the portion of the display time at step S164. When it is determined that the reproduction is not to be conducted, the frame is not displayed and the processing is moved to the next frame processing.
[0238] It is determined at step S151 whether or not the whole video to be reproduced is processed. When the whole video is processed, the reproduction processing is also ended.
[0239] When it is determined whether the frame is to be reproduced or not at step S163, it is desired in some cases that the determination depends on the taste of the user. At this time, it is determined from the user profile whether or not the non-reproduction portion is reproduced in advance before the reproduction of the video. When the non-reproduction portion is reproduced, the frame is reproduced without fail at step S164.
[0240] In addition, when the reproduction/non-reproduction information is described as a continuous value, a threshold value is determined from the user profile for differentiating the reproduction and the non-reproduction to determine the reproduction or the non-reproduction depending on whether or not the reproduction/non-reproduction information exceeds the threshold value. Except for using the user profile, for example, the threshold value is calculated from the importance set for each frame, or information may be received in advance from the user as to whether the reproduction or non-reproduction is provided in real time.
[0241] In this manner, it becomes possible to reproduce only a portion of the video by adding to the frame information the reproduction/non-reproduction information 123 for controlling whether the video is reproduced or not, with the result that it becomes possible to reproduce only the highlight scene or only the scene in which a man or an object of interest appears.
[0242] Next, there will be explained a describing method if the location information of media (for example, text or sound) other than the video associated with the video to be displayed, and time for displaying or reproducing the video is added to the frame information (see FIG. 1) as additional information.
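The reproduction procedure of FIG. 48 (paragraphs [0236] to [0240]) can be sketched as a simple loop. The tuple layout of a frame, the `display` callback and the parameters `threshold` and `play_all` (standing for values taken from a user profile) are assumptions made for illustration, not the structure of the actual apparatus.

```python
def reproduce(frames, display, threshold=0.5, play_all=False):
    """Reproduce only the frames whose information indicates reproduction.

    frames:    iterable of (video_location, display_time, reproduction)
               where reproduction is a bool or a continuous level.
    display:   callback invoked as display(video_location, display_time).
    threshold: level above which a frame is reproduced (e.g. from a user profile).
    play_all:  if True, non-reproduction portions are reproduced as well.
    """
    for video_location, display_time, reproduction in frames:   # read frame information
        if play_all:
            show = True
        elif isinstance(reproduction, bool):
            show = reproduction
        else:
            show = reproduction > threshold
        if show:
            display(video_location, display_time)   # display for the display time
        # otherwise the processing moves to the next frame
    # the loop ends when the whole video has been processed
```

A caller would typically pass a function that decodes the frame at `video_location` and holds it on screen for `display_time`.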

[0243] In FIG. 8, the video location information 101 and the display time information 102 are included in each frame information 100. In FIG. 34, the video location information 101 and importance information 122 are included in each frame information 100. In FIG. 35, the video location information 101, the display time information 121, and importance information 122 are included in each frame information 100. In FIGS. 44, 45, and 46, there is further shown an example in which the reproduction/non-reproduction information 123 is included in each frame information 100. In any example, 0 or more sound location information 2703, sound reproduction time information 2704, 0 or more text information 2705 and text display time information 2706 (however, 1 or more in any of the information) may be added.

[0244] FIG. 49 shows an example in which one set of sound location information 2703 and sound reproduction time information 2704 and N sets of text information 2705 and text display time information 2706 are added to an example of the data structure of FIG. 8.
[0245] The sound is reproduced for the time indicated by the sound reproduction time information 2704 from the location indicated by the sound location information 2703. An object of reproduction may be sound information attached to the video from the beginning. Background music may also be created and newly added.
[0246] As for the text, the text information indicated by the text information 2705 is displayed for the time indicated by the text display time information 2706. A plurality of items of text information may be added to one video frame.
[0247] The time when the sound reproduction and the text display are started is the same as the time when the associated video frame is displayed. The sound reproduction time and the text display time are set within the range of the associated video frame time. If continuous sound is reproduced over a plurality of video frames, the sound location information and the reproduction time may be set to be continuous.
[0248] With such a method, summarized sound and summarized text can be made possible.
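The data structure of FIG. 49 and the constraint of paragraph [0247] can be sketched as follows. The class and field names are assumptions; the reference numerals in the comments are those of the description.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class TextItem:
    text: str             # text information 2705
    display_time: float   # text display time information 2706


@dataclass
class FrameInfoWithMedia:
    video_location: int                      # video location information 101
    display_time: float                      # display time information 102
    sound_location: Optional[int] = None     # sound location information 2703
    sound_time: Optional[float] = None       # sound reproduction time information 2704
    texts: List[TextItem] = field(default_factory=list)   # N sets of text items


def media_fits_frame(info):
    """True if the sound and every text stay within the display time of the frame."""
    if info.sound_time is not None and info.sound_time > info.display_time:
        return False
    return all(t.display_time <= info.display_time for t in info.texts)
```

Because sound and text start together with the associated frame, checking their durations against the frame display time is enough to verify the constraint.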
[0249] FIG. 50 shows one example of a method for describing the sound information separately from the frame information. This is an example of a data structure for reproducing sound associated with the video frame which is displayed at the time when the special reproduction is conducted. A set of the location information 2801 showing the location of the sound to be reproduced, reproduction start time 2802 when the sound reproduction is started, and reproduction time 2803 for which the reproduction is continued is set as one item of sound information 2800 to be described as an array of this sound information.
[0250] FIG. 51 shows a data structure for describing the text information. The data structure has the same structure as the sound information. A set of character code location information 2901 of the text to be displayed, a display start time 2902, and a display time 2903 is set as one item of text information 2900 to be described as an array of this text information. Instead of the character code location information 2901, location information may be used which indicates a location where the character code is stored, or a location where the character is stored as a video.

[0251] The above sound information or the text information is synchronized with the display of the video frame to be displayed as information associated with the video frame or a constant video frame section in which the displayed video frame is present. As shown in FIG. 52, the reproduction or the display of the sound information or the text information is started with the lapse of time shown by the time axis 3001. First, the video 3002 is displayed and reproduced for the described display time in the order in which the respective video frames are described. Reference numerals 3005, 3006 and 3007 denote respective video frames and a predetermined display time is allocated thereto. The sound 3003 is reproduced when the reproduction start time described in each sound information comes. When the reproduction time described in a similar manner has passed, the reproduction is suspended. As shown in FIG. 52, a plurality of sounds 3008 and 3009 may be reproduced. In a similar manner as the sound, the text 3004 is also displayed when the display start time described in each item of the text information comes. When the display time which is described has passed, the display is suspended. A plurality of texts 3010 and 3011 may be displayed at the same time.
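The separately described sound and text information of FIGS. 50 and 51, and the timing behaviour of FIG. 52, can be sketched with one shared record type. The class name `TimedItem`, the use of seconds on a common time axis, and the half-open interval are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class TimedItem:
    location: int       # location information 2801 or character code location information 2901
    start_time: float   # reproduction start time 2802 or display start time 2902
    duration: float     # reproduction time 2803 or display time 2903


def active_items(items, t):
    """Items being reproduced or displayed at time t on the time axis.

    An item is active from its start time until its duration has passed,
    so several sounds or texts may be active at the same time.
    """
    return [it for it in items if it.start_time <= t < it.start_time + it.duration]
```

A player would evaluate `active_items` for the sound array and for the text array as the time axis advances, independently of which video frame is currently displayed.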
[0252] It is not required that the sound reproduction start time and the text display start time coincide with the time at which the video frame is displayed. It is not required that the sound reproduction time and the text display time coincide with the display time of the video frame. These times can be freely set. On the contrary, the display time of the video frame may be changed in accordance with the sound reproduction time and the text display time.
[0253] It is possible that these times can be manually set by man.
[0254] In order to omit the trouble of determination by man, it is preferable to determine a phenomenon which is likely to appear in the video scene which seems to be important and to automatically set these times. Hereinafter, several examples of automatic setting are shown.
[0255] FIG. 53 shows one example of a processing procedure in which a continuous video frame section is determined which is referred to as a shot from a change-over of the screen up to the next change-over of the screen, so that the total of the display time of the video frames included in the shot is defined as the sound reproduction time. FIG. 53 is also established as a function block diagram.
[0256] At step S3101, the shot is detected from the video. For this purpose, there are used such methods as a method for detecting a cut of a motion picture from the MPEG bit streams using a tolerance ratio detection method (The Transactions of the Institute of Electronics, Information and Communication Engineers, Vol. J82-D-II, No. 3, pp. 361-370, 1999) and the like.
[0257] At step S3102, the video frame location information is referred to, thereby investigating which shot respective video frames belong to. Furthermore, the display times of respective shots are calculated by taking the total of the display times of the video frames.
[0258] For example, the sound location information is set as the sound location corresponding to the start of the shot. The sound reproduction start time may be allowed to coincide with the display time of the initial video frame which belongs to each shot while the sound reproduction time may be set to be equal to the display time of the shot. Otherwise, in accordance with the reproduction time of the sound, the display time of the video frames included in each shot may be corrected. Although the shot is detected here, if a data structure is taken wherein the importance information is described in the frame information, the section having importance exceeding the threshold value is determined by using the importance with respect to the video frame so that the sound included in the section may be reproduced.
[0259] If the determined reproduction time does not meet a constant reference, the sound may not be reproduced.
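The procedure of FIG. 53 (paragraphs [0255] to [0259]) can be sketched as follows, assuming that shot detection has already produced inclusive frame ranges. The function name, the dictionary keys and the parameter `min_duration` (standing for the constant reference of paragraph [0259]) are assumptions.

```python
def sound_info_for_shots(frames, shots, min_duration=0.0):
    """Derive one sound item per shot from the frame information.

    frames: list of (video_location, display_time) in display order.
    shots:  list of (first_location, last_location) inclusive ranges,
            assumed to come from a separate shot detection step.
    """
    per_shot = {}     # shot index -> (start time, total display time)
    elapsed = 0.0     # time on the axis of the special reproduction
    for location, display_time in frames:
        for index, (first, last) in enumerate(shots):
            if first <= location <= last:
                start, total = per_shot.get(index, (elapsed, 0.0))
                per_shot[index] = (start, total + display_time)
                break
        elapsed += display_time

    result = []
    for index in sorted(per_shot):
        start, total = per_shot[index]
        if total >= min_duration:       # too short a sound is not reproduced
            result.append({
                "sound_location": shots[index][0],   # start of the shot
                "start_time": start,
                "duration": total,
            })
    return result
```

The start time of each sound coincides with the display of the first extracted frame of the shot, and its duration equals the total display time of the frames belonging to the shot.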

[0260] FIG. 54 shows one example of a processing procedure in which important words are taken out, with sound recognition, from sound data corresponding to the shot or the video frame section having the high importance so that the words, or the sound including the words or the sound in which a plurality of words are combined are reproduced. FIG. 54 is also established as a function block diagram.
[0261] At step S3201, the shot is detected. In place of the shot, the video frame section having the high importance may be calculated.
[0262] At step S3202, the sound recognition is carried out with respect to the sound data section corresponding to the obtained video frame section.
[0263] At step S3203, sounds including the important word portion or sounds of the important word portion are determined from the recognition result. In order to select the important words, an important word dictionary 3204 is referred to.
[0264] At step S3205, the sound for reproduction is created. Continuous sounds including the important words may be used as they are. Only important words may be extracted. Sounds having a combination of a plurality of important words may be created.
[0265] At step S3206, in accordance with the reproduction time of the created sound, the display time of the video frame is corrected. However, the number of selected words may be decreased and the reproduction time of the sound may be shortened so that the sound reproduction time is set to be within the display time of the video frame.
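The alternative at the end of paragraph [0265], in which the number of selected words is decreased until the sound fits the display time, can be sketched as follows. The input format `(word, start, end)` for the recognition result, the function name and the policy of dropping the last selected word first are assumptions; the description does not state which words are dropped.

```python
def select_important_words(recognized, dictionary, display_time):
    """Pick important words whose total sound duration fits the display time.

    recognized:   list of (word, start, end) from a sound recognition step.
    dictionary:   set of important words (important word dictionary).
    display_time: display time of the associated video frame section.
    """
    selected = [w for w in recognized if w[0] in dictionary]
    # Decrease the number of selected words until the sound fits.
    while selected and sum(end - start for _, start, end in selected) > display_time:
        selected.pop()
    return selected
```

The opposite choice described in the same paragraph, correcting the display time of the video frame to match the created sound, would instead leave the word list untouched and lengthen the frame display times.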

[0266] FIG. 55 shows one example of a procedure in which text information is obtained from the telop. FIG. 55 is also established as a function block diagram.
[0267] In the processing of FIG. 55, the text information is obtained from the telop or the sound displayed in the video.
[0268] At step S3301, the telop displayed in the video is read. This includes a method in which the telop in the original video is automatically extracted, or the telop is read by man to be manually input, with a method or the like described in, for example, literature such as "[...] portion from the video for the telop region" by Osamu Hori [...].
[0269] At step S3302, important words are taken out from the telop character string which has been read. In the judgment of important words, an important word dictionary 3303 is used. The telop character string which is read may be text information as it is. Extracted words are arranged, and a sentence representing the video frame section may be constituted with only the important words to provide text information.
[0270] FIG. 56 shows one example for obtaining the text information from the sound. FIG. 56 is also established as a function block diagram.
[0271] In the sound recognition processing at step S3401, sound is recognized.
[0272] At step S3402, important words are taken out from the recognized sound data. In the judgment of important words, an important word dictionary 3403 is used. The recognized sound data may be used as text information. Extracted words are arranged, and a sentence is constituted which represents the video frame section with only the important words to provide text information.
[0273] FIG. 57 shows an example of a processing procedure for taking out text information and preparing the text information with telop recognition from the shot or from the video frame section having high importance. FIG. 57 is also established as a function block diagram.
[0274] At step S3501, the shot is detected from the video. Instead of the shot, the section having high importance may be determined.

[0275] At step S3502, the telop represented in the video frame section is recognized.
[0276] At step S3503, the important words are extracted by using an important word dictionary 3504.
[0277] At step S3505, text for the display is created. For this purpose, a telop character string including important words may be used. Only important words or a character string using the important words may be used as text information. If text information is obtained by sound recognition, the telop recognition processing at step S3502 is replaced with sound recognition processing to input sound data. The text information is displayed together with the video frame in which the text is displayed as telop or the video frame of the time at which the data is reproduced as sound. Otherwise, text information in the video frame section may be displayed at one time.
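The creation of display text from a recognized character string (telop or sound) with an important word dictionary can be sketched as follows. Splitting on whitespace and matching in lower case are simplifying assumptions made for illustration; a real telop string, for example in Japanese, would need a proper word segmentation step.

```python
def text_information(recognized, dictionary, keep_whole_string=False):
    """Create text information from a recognized character string.

    recognized:        character string read from the telop or the sound.
    dictionary:        set of important words, in lower case.
    keep_whole_string: if True, a string containing important words is
                       used as it is; otherwise only the important words
                       are arranged into the text.
    """
    words = recognized.split()
    important = [w for w in words if w.lower() in dictionary]
    if not important:
        return ""      # no important word: no text is provided
    return recognized if keep_whole_string else " ".join(important)
```

The two modes correspond to using the character string including important words as it is, or constituting the text from the important words only.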

[0278] FIGS. 58A and 58B are views showing a display example of the text information. As shown in FIG. 58A, the display may be divided into the text information display area 3601 and the video display area 3602. As shown in FIG. 58B, the text information may be overlapped with the video display area 3603.
[0279] Respective display times (reproduction times) of the video frame, the sound information and the text information may be adjusted so that all the media information is synchronized. For example, at the time of the double speed reproduction of the video, important sounds are extracted by the above method, and sound information of half the time of the normal reproduction is obtained. Next, the display time is allocated to the video frame associated with respective sounds. If the display time of the video frame is determined so that the scene change quantity becomes constant, the sound reproduction time or the text display time is set to be within the display time of the respectively associated video frames. Otherwise, a section including a plurality of video frames is determined like the shot, so that the sound or the text included in the section is reproduced or displayed in accordance with the display time of the section.
[0280] So far, the explanation has focused mainly on video data. However, the data structure of the present invention can be modified to data having no frame information, i.e., the sound data. It is possible to use sound information and text information in the form without the frame information. In this case, a summary is created which comprises only sound information or text information with respect to the original video data. In addition, a summary can be created which comprises only sound information and text information with respect to the sound data and music data.
[0281] Though the data structures shown in FIGS. 50 and 51 are used to describe the sound information and text information in synchronization with the video data, it is possible to summarize the sound data and text data only. To summarize the sound data, the data structure shown in FIG. 50 can be used irrespective of the video information. To summarize the text data, the data structure shown in FIG. 51 can be used irrespective of the video information. At that time, in the same manner as in the case of the frame information, the original data information may be added to describe a correspondence relationship between the original sound and music data and the sound information and text information.
[0282] FIG. 59 shows an example of a data structure in which the original data information 4901 is included in the sound information shown in FIG. 50. If the original data is the video, the original data information 4901 indicates the section of video frames (start point information 4902 and section length information 4903).
[0283] If the original data is sound data and music data, the original data information 4901 indicates the section of sound and music.
[0284] FIG. 60 shows an example of a data structure in which the original data information 4901 is included in the sound information shown in FIG. 30.
[0285] FIG. 61 explains an example in which sound/music is summarized by using the sound information. The original sound/music is divided into several sections. A portion of the section is extracted as the summarized sound/music so that the summary of the original data is created. For example, a portion 5001 of the section 2 is extracted as summarized sound/music to be reproduced as a section 5002 of the summary. As an example of a method for dividing the section, the music may be divided into chapters and the conversation may be divided by the contents.
[0286] Furthermore, in the same manner as in the case of the frame information, the description of the original data file and the section are included in the sound information and the text information with the result that a plurality of sound/music data items can be summarized together. At this time, if identification information is added to the individual original data, the original data identification information may be described in place of the original data file and the section.
[0287] FIG. 62 explains an example in which sound/music is summarized by using the sound information. Portions of plural sound/music data items are extracted as the summarized sound/music so that the summary of the original data is created. For example, a portion 5001 of the sound/music data item 2 is extracted as summarized sound/music to be reproduced as a section 5102 of the summary. As one usage, a portion of the section of each piece of music included in one music album is extracted, so that summarized data for trial listening can be created.
[0288] If an album is summarized, the title of the music may be included in the music information when it is preferable that the title of the music can be known. This information is not indispensable.
[0289] Next, a method of providing video data will be explained.
[0290] If the special reproduction control information created in the processing of the embodiment is provided for use, it is necessary to provide the special reproduction control information from the side of those who create the information to the side of the user by some means. As this method of providing the special reproduction control information, various forms can be considered as exemplified below:

    (1) Video data and special reproduction control information are recorded on one (or a plurality of) recording medium (or media) and provided at the same time;

    (2) Video data is recorded on one (or a plurality of) recording medium (or media) and provided, and the special reproduction control information is separately recorded on one (or a plurality of) recording medium (or media) and provided;

    (3) Video data and the special reproduction control information are provided via the communication medium on the same occasion;

    (4) Video data and the special reproduction control information are provided via the communication media on different occasions.

[0291] According to the above described embodiments, a special reproduction control information describing method for describing special reproduction control information provided for special reproduction with respect to the video contents describes, as the frame information, for each of frames or groups of continuous or adjacent frames selectively extracted from the whole frame series of video data constituting the video contents, first information showing a location at which video data of the one frame or one group is present and second information associated with display time allocated to the one frame or the frame group, and/or third information showing importance allocated to the one frame or the frame group corresponding to the frame information.

[0292] According to the above described embodiments, a computer readable recording medium storing a special reproduction control information stores at least frame information described for each of frames or groups of continuous or adjacent frames selectively extracted from the whole frame series of video data constituting the video contents, the frame information comprising first information showing a location at which video data of the one frame or one group is present and second information associated with display time allocated to the one frame or the frame group, and/or third information showing importance allocated to the one frame or the frame group corresponding to the frame information.

[0293] According to the above described embodiments, a
special reproduction control information describing
apparatus/method for describing special reproduction
control information provided for special reproduction
with respect to the video contents describes, as the
frame information, for each of frames or groups of
continuous or adjacent frames selectively extracted
from the whole frame series of video data constituting
the video contents, video location information showing
a location at which video data of the one frame or one
group is present and display time control information
including display time information and basic
information based on which the display time is
calculated, to be allocated to the one frame or the
frame group.
[0294] According to the above described embodiments, a
special reproduction apparatus/method enables a special
reproduction with respect to video contents, wherein
special reproduction control information is referred to
which includes at least frame information including
video location information showing a location at which
one frame data or one frame group data is present, which
information is described for each of the frame groups
comprising one frame selectively extracted out of the
whole frame series of the video data allocated to the
video contents and constituting the video contents or a
plurality of continuous or adjacent frames; the one
frame data or the frame group data corresponding to each
frame information is obtained on the basis of video
location information included in the frame information,
while the display time which should be allocated to each
frame information is determined on the basis of display
time control information included in at least each
frame information; and data on one frame or a plurality
of frames which is or are obtained is reproduced at the
determined display time in a predetermined order,
thereby carrying out a special reproduction.
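
The reproduction steps above (refer to the frame information, obtain the data, determine the display time, display in order) can be sketched roughly as follows. This is a minimal sketch, not the apparatus of the embodiments: the function `special_reproduction`, the `(location, display_time)` tuples, and the `fetch`, `show` and `sleep` callbacks are all hypothetical names chosen for illustration.

```python
import time
from typing import Callable, Iterable, List, Tuple

# (location, display_time_seconds) pairs stand in for items of frame information.
FrameItem = Tuple[int, float]


def special_reproduction(
    items: Iterable[FrameItem],
    fetch: Callable[[int], object],
    show: Callable[[object], None],
    sleep: Callable[[float], None] = time.sleep,
) -> List[int]:
    """Display each extracted frame for its allocated time, in list order."""
    shown = []
    for location, display_time in items:
        frame = fetch(location)  # obtain video data from the location information
        show(frame)              # display the obtained data ...
        sleep(display_time)      # ... for the determined display time
        shown.append(location)
    return shown


# Usage with stand-in callbacks (no real decoder or screen is involved).
displayed = []
waits = []
order = special_reproduction(
    [(0, 0.5), (120, 1.0)],
    fetch=lambda n: f"frame{n}",
    show=displayed.append,
    sleep=waits.append,
)
```

Passing the decoder, the display and the timer in as callbacks keeps the loop independent of where the video data is stored, which matches the separation between location information and display time information in the text.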
[0295] In the above described embodiments, for example,
image data is created in advance, which is extracted in
frame units from location information on an effective
video frame or an original video which is used for
display, and the video frame location information or
information on the display time of the image data is
created separately from the original video. Either video
frames or the image data extracted from the original
video is continuously displayed on the basis of the
display information so that a special reproduction such
as a double speed reproduction, a trick reproduction,
jump continuous reproduction or the like is enabled.
[0296] In the double speed reproduction for confirming
the contents at a high speed, display time is determined
in advance in such a manner that the display time is
extended at a location where a motion of the scene is
large while the display time is shortened at a location
where the motion is small, so that the change in the
display screen becomes as constant as possible.
Alternatively, the same effect can be obtained even when
the location information is determined so that an
interval of the extracted locations is made small at a
location where a motion of the video frame or video data
used for the display is large while the interval is made
large at a location where the motion is small. A
reproduction speed control value may be created so that
the double speed value or the overall reproduction time
designated by a user is obtained. A long video can be
viewed at double speed reproduction, so that the video
can be easily viewed in a short time, and the contents
can be grasped in a short time.
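
One simple way to realize the display time allocation described above is to make each display time proportional to a per-frame motion measure and scale the result to the total time the user asked for. The sketch below assumes such a measure is already available as a non-negative number per extracted frame; the function name `allocate_display_times` and the proportional rule are illustrative assumptions, not the calculation used in the embodiments.

```python
from typing import List


def allocate_display_times(activities: List[float], total_time: float) -> List[float]:
    """Split total_time across frames in proportion to each frame's motion activity."""
    if total_time <= 0:
        raise ValueError("total_time must be positive")
    if any(a < 0 for a in activities):
        raise ValueError("activities must be non-negative")
    if not activities:
        return []
    total_activity = sum(activities)
    if total_activity == 0:
        # No motion anywhere: fall back to equal display times.
        return [total_time / len(activities)] * len(activities)
    return [total_time * a / total_activity for a in activities]


# A 60 s source viewed at 4x speed leaves 15 s to share among the extracted frames.
times = allocate_display_times([1.0, 3.0, 1.0], total_time=60.0 / 4)
print(times)  # [3.0, 9.0, 3.0]
```

With this rule the ratio of activity to display time is the same for every frame, so the amount of change shown per second stays constant, and the display times always sum to the designated total.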
[0297] It is possible to reproduce videos so that
important locations are not overlooked by extending the
display time at the important locations and shortening
the display time at unimportant locations in accordance
with the importance of the video.
[0298] Only important locations may be efficiently
reproduced by partially omitting a part of the video
without displaying the whole video frame.
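
The two uses of importance described above (longer display for important locations, omission of unimportant ones) can be combined in a short sketch. The linear scaling by importance, the `skip_below` threshold and the function name are assumptions made for illustration; the embodiments do not prescribe this particular rule.

```python
from typing import List, Optional


def display_times_by_importance(
    importances: List[float],
    base_time: float,
    skip_below: float = 0.0,
) -> List[Optional[float]]:
    """Return a display time per frame, or None for frames that are omitted."""
    result: List[Optional[float]] = []
    for importance in importances:
        if importance < skip_below:
            result.append(None)                    # unimportant: not reproduced
        else:
            result.append(base_time * importance)  # more important: shown longer
    return result


schedule = display_times_by_importance([0.2, 1.0, 0.5], base_time=2.0, skip_below=0.3)
print(schedule)  # [None, 2.0, 1.0]
```

Raising `skip_below` shortens the overall reproduction by dropping more frames, while `base_time` stretches or compresses the frames that remain.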
[0299] According to embodiments of the present
invention, an effective special reproduction is enabled
on the basis of the control information on the
reproduction side by arranging and describing, as
control information provided for special reproduction of
the video contents, a plurality of frame information
including a method for obtaining a frame or a group of
frames selectively extracted from the original video,
information on the display time (absolute or relative
value) allocated to the frame or the group of frames,
and information which forms the basis for obtaining the
information on the display time.

[0300] For example, each of the above functions can be
realized as software. The above embodiments can be
realized as a computer readable recording medium on
which a program is recorded for allowing the computer to
conduct predetermined means, or for allowing the
computer to function as predetermined means, or for
allowing the computer to realize a predetermined
function.
[0301] The structures shown in each of the embodiments
are one example, and are not intended to exclude other
structures. It is also possible to provide a structure
which is obtained by replacing a part of the structure
exemplified above with another structure, omitting a
part of the exemplified structure, adding a different
function to the exemplified structure, and combining
such measures. A different structure logically
equivalent to the exemplified structure, a different
structure including a part logically equivalent to the
exemplified structure, and a different structure
logically equivalent to the essential portion of the
exemplified structure can be provided. Another
structure identical to or similar to the exemplified
structure, or a different structure having the same
effect as the exemplified structure or a similar effect
can be provided.

[0302] In each of the embodiments, various variations
with respect to various structure components can be put
into practice in an appropriate combination.
[0303] Each of the embodiments includes or inherently
contains an invention associated with various
viewpoints, stages, concepts or categories such as, for
example, an invention as a method for describing
information, an invention as information which is
described, an invention as an apparatus or a method
corresponding thereto, and an invention as an inside of
the apparatus or a method corresponding thereto.
[0304] Consequently, the invention can be extracted
without being limited to the exemplified structure from
the content disclosed in the embodiment according to
this invention.
Claims

1. A method of describing frame information, the
method characterized by comprising:

describing, for a frame extracted from a plurality of
frames in a source video data, first information (101)
specifying a location of the extracted frame in the
source video data; and

describing, for the extracted frame, second information
(102) relating to a display time of the extracted frame.

2. The method according to claim 1, characterized in
that the extracted frame comprises a group of frames,
and the first information comprises information
specifying a location of the extracted group of frames
in the source video data.

3. The method according to claim 1 or 2, characterized
by further comprising describing, for the extracted
frame, third information relating to importance of the
extracted frame.

4. The method according to claim 1, 2 or 3,
characterized in that the first information comprises
information specifying an image data file created from
the video data of the extracted frame.

5. The method according to claim 1, characterized in
that the extracted frame comprises a frame extracted
from a plurality of frames included in a temporal
section of the source video data, and further describing
fourth information specifying the temporal section of
the source video data.

6. The method according to claim 5, characterized in
that the first information comprises information
specifying an image data file created from the source
video data of the extracted frame, the image data
corresponding to the extracted frame.

7. The method according to any one of the preceding
claims, characterized in that the second information
comprises information relating to such display time
that a frame activity value during a special
reproduction is kept substantially constant.

8. The method according to any one of the preceding
claims, characterized by further comprising describing
fifth information (123) indicating whether the extracted
frame is reproduced or not.

9. The method according to claim 1, characterized in
that the first information comprises one of information
specifying a location of the extracted frame among the
plurality of frames and information specifying a
location of image data within an image data file created
from the source video data and stored separately from
the video data, the image data corresponding to the
extracted frame.

10. The method according to any one of the preceding
claims, characterized by further comprising describing,
for media data other than the source video data
including the extracted frame, information specifying a
location of the media data and information relating to a
display time of the media data.

11. An article of manufacture comprising a computer
usable medium storing frame information, the frame
information characterized by comprising:

first information (101), described for a frame extracted
from a plurality of frames, specifying a location of the
extracted frame in the source video data; and

second information (102), described for the extracted
frame, relating to a display time of the extracted
frame.

12. The article of manufacture according to claim 11,
characterized in that the extracted frame comprises a
group of frames, and the first information comprises
information specifying a location of the extracted group
of frames in the source video data.

13. The article of manufacture according to claim 11 or
12, characterized in that the frame information
comprises third information (122) relating to
importance of the extracted frame.

14. The article of manufacture according to claim 11, 12
or 13, characterized in that the first information
comprises information specifying an image data file
created from the video data of the extracted frame.

15. The article of manufacture according to claim 11,
characterized by further comprising storing the source
video data and an image data file corresponding to the
source video data of the extracted frame in addition to
the frame information.

16. An apparatus for creating frame information, the
apparatus characterized by comprising:

a unit configured to extract a frame from a plurality of
frames in a source video data;

a unit configured to create the frame information
including first information specifying a location of the
extracted frame and second information relating to a
display time of the extracted frame; and

a unit configured to link the extracted frame to the
frame information.

17. A method of creating frame information, the method
characterized by comprising:

extracting a frame from a plurality of frames in a
source video data; and

creating the frame information including first
information specifying a location of the extracted frame
in the source video data and second information relating
to a display time of the extracted frame.

18. An apparatus for performing a special reproduction,
characterized by comprising:

a unit configured to refer to frame information
described for a frame extracted from a plurality of
frames in a source video data and including first
information specifying a location of the extracted frame
in the source video data and second information relating
to a display time of the extracted frame;

a unit configured to obtain the video data corresponding
to the extracted frame based on the first information;

a unit configured to determine the display time of the
extracted frame based on the second information; and

a unit configured to display the obtained video data for
the determined display time.

19. A method of performing a special reproduction
characterized by comprising:

referring to frame information described for a frame
extracted from a plurality of frames in a source video
data and including first information (101) specifying a
location of the extracted frame and second information
(102) relating to a display time of the extracted frame;

obtaining the video data corresponding to the extracted
frame based on the first information;

determining the display time of the extracted frame
based on the second information; and

displaying the obtained video data for the determined
display time.

20. An article of manufacture comprising a computer
usable medium having computer readable program code
means embodied therein, the computer readable program
code means performing a special reproduction, the
computer readable program code means characterized by
comprising:

computer readable program code means for causing a
computer to refer to frame information described for a
frame extracted from a plurality of frames in a source
video data and including first information (101)
specifying a location of the extracted frame and second
information (102) relating to a display time of the
extracted frame;

computer readable program code means for causing a
computer to obtain the video data corresponding to the
extracted frame based on the first information;

computer readable program code means for causing a
computer to determine the display time of the extracted
frame based on the second information; and

computer readable program code means for causing a
computer to display the obtained video data for the
determined display time.

21. A method of describing sound information, the
method characterized by comprising:

describing, for a frame extracted from a plurality of
sound frames in a source sound data, first information
specifying a location of the extracted frame in the
source sound data; and

describing, for the extracted frame, second information
relating to a reproduction start time and reproduction
time of the sound data of the extracted frame.

22. An article of manufacture comprising a computer
usable medium storing frame information, the frame
information characterized by comprising:

first information, described for a frame extracted from
a plurality of sound frames, specifying a location of
the extracted frame in the source sound data; and

second information, described for the extracted frame,
relating to a reproduction start time and reproduction
time of the sound data of the extracted frame.

23. A method of describing text information, the method
characterized by comprising:

describing, for a frame extracted from a plurality of
text frames in a source text data, first information
specifying a location of the extracted frame in the
source text data; and

describing, for the extracted frame, second information
relating to a display start time and display time of the
text data of the extracted frame.

24. An article of manufacture comprising a computer
usable medium storing frame information, the frame
information characterized by comprising:

first information, described for a frame extracted from
a plurality of text frames in a source text data,
specifying a location of the extracted frame in the
source text data; and

second information, described for the extracted frame,
relating to a display start time and display time of the
text data of the extracted frame.

25. A carrier medium carrying computer readable
instructions for controlling the computer to carry out
the method of any one of claims 1 to 10, 17, 19, 21
and 23.